Copy UNICODE_STRING to *CHAR

Zorba_Kov · October 10, 2007, 3:38am

Hi!

Maybe this question is not in the right forum, but I need to use it in my driver, so I ask it here…

I am trying to copy a string from UNICODE_STRING to *CHAR (PCHAR).

I tried to do it this way:

UNICODE_STRING us_str;
RtlInitUnicodeString(&us_str, L"A String");

PCHAR pchar_str;
RtlCopyBytes(pchar_str, us_str.Buffer, us_str.Length);

But that doesn’t work… Can anyone help me with that?

I would be very thankful!

Jan_Milan · October 10, 2007, 4:26am

Hi,
you should use RtlCopyMemory in drivers (it’s written in the documentation for
RtlCopyBytes).
PCHAR is not the same as PWCHAR (look at RtlUnicodeStringToAnsiString).
It looks like C++ code.

Zorba_Kov · October 10, 2007, 4:56am

I tried it now, and at the end of the operation the string in pchar_str is "".

What am I doing wrong?

OSR_Community_User · October 10, 2007, 5:38am

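You cannot copy Unicode strings to a PCHAR byte for byte, because Unicode strings use two bytes per character. For Western characters the second byte is zero, which reads as a terminator in a CHAR string; this is why you see only the first character in pchar_str.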
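To make the point above concrete, here is a small user-mode sketch (not driver code). It assumes the text is stored as UTF-16LE, which is how Windows lays out a UNICODE_STRING buffer; the array and the function name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "A String" spelled out as UTF-16LE bytes (assumed layout of the wide buffer). */
static const uint8_t utf16le_bytes[] = {
    'A', 0, ' ', 0, 'S', 0, 't', 0, 'r', 0, 'i', 0, 'n', 0, 'g', 0
};

/* Copy the raw bytes into a char buffer, as the original RtlCopyBytes call
 * does, and report how long the result looks to a byte-string function. */
size_t visible_length_after_byte_copy(void)
{
    char narrow[sizeof(utf16le_bytes) + 1];

    assert(sizeof(utf16le_bytes) % 2 == 0);   /* whole 16-bit units only */
    memcpy(narrow, utf16le_bytes, sizeof(utf16le_bytes));
    narrow[sizeof(utf16le_bytes)] = '\0';

    /* strlen stops at the zero high byte that follows 'A'. */
    return strlen(narrow);
}
```

All sixteen bytes are copied, but anything that treats the buffer as a C string sees only "A".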

A naive and dirty loop like this would do the work for standard Western languages:

// Note: allocate space for pchar first (one byte per character, plus one for the final zero)!
for (i = 0; i < us_str.Length / sizeof(WCHAR); i++)
    pchar[i] = (CHAR)us_str.Buffer[i];
pchar[i] = 0; // Don't forget the final zero!

Note that many Unicode characters have no direct translation to single-byte characters, so you may end up with a pchar string that is not equivalent to the Unicode string.
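As a rough illustration of that caveat, the sketch below narrows 16-bit code units to bytes and substitutes '?' for anything outside ASCII instead of silently truncating it. It is plain user-mode C; uint16_t stands in for WCHAR, and narrow_ascii is a hypothetical helper, not a kernel routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: narrow 16-bit code units to single bytes.
 * length_bytes mirrors UNICODE_STRING.Length (a byte count, not characters).
 * Code units above 0x7F have no ASCII equivalent and are replaced by '?'.
 * Returns 1 on success, 0 if dst cannot hold the text plus the final zero. */
int narrow_ascii(char *dst, size_t dst_size,
                 const uint16_t *src, size_t length_bytes)
{
    size_t count = length_bytes / sizeof(uint16_t);
    size_t i;

    assert(dst != NULL && src != NULL);
    if (dst_size < count + 1)
        return 0;

    for (i = 0; i < count; i++)
        dst[i] = (src[i] <= 0x7F) ? (char)src[i] : '?';
    dst[count] = '\0';
    return 1;
}
```

A Hebrew letter such as U+05D0 comes out as '?', which makes the data loss visible rather than producing a random byte.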

If you want a compact but slower way, you can use sprintf, like this:

sprintf(pchar, "%wZ", &us_str);

If you want all characters properly translated according to the machine’s locale settings, use the function:

RtlUnicodeStringToAnsiString.

—
NTFSD is sponsored by OSR

For our schedule debugging and file system seminars
(including our new fs mini-filter seminar) visit:
http://www.osr.com/seminars

You are currently subscribed to ntfsd as: xxxxx@pandasecurity.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

Jan_Milan · October 10, 2007, 5:47am

You are probably still mixing up PCHAR with PWCHAR.
If you want to get from a UNICODE_STRING to a PCHAR, you should do something like this:

// This is just an illustration that I wrote on the spot, so there may be errors, even compilation errors.
NTSTATUS status;
PCHAR pchar_str;
UNICODE_STRING us_str = RTL_CONSTANT_STRING(L"A String");
ANSI_STRING as_str;

ASSERT(KeGetCurrentIrql() == PASSIVE_LEVEL);

as_str.Length = 0;
as_str.MaximumLength = RtlUnicodeStringToAnsiSize(&us_str);

pchar_str = ExAllocatePoolWithTag(PagedPool, as_str.MaximumLength, SOME_TAG);
if (pchar_str == NULL)
    return STATUS_INSUFFICIENT_RESOURCES;

as_str.Buffer = pchar_str;

status = RtlUnicodeStringToAnsiString(&as_str, &us_str, FALSE);

if (NT_SUCCESS(status))
{
    pchar_str[as_str.MaximumLength] = ANSI_NULL; // I'm not sure if the null is already there

    // now you could use pchar_str
}

ExFreePoolWithTag(pchar_str, SOME_TAG);

Zorba_Kov · October 10, 2007, 6:25am

Jan Milan,

What you told me to do works great, though I have a warning about the line:

as_str.MaximumLength = RtlUnicodeStringToAnsiSize(&us_str);

It says that I am assigning a ULONG to a USHORT, and there can be a loss of data.

Anyway, it works!

But now I have another question: after I initialize pchar_str I send it to a user mode application and print it there. When I print it I get some strange characters. That happens when I send something in Hebrew. My question is: what is the size of each character in pchar_str, and how can I get it in my application without loss of data?

OSR_Community_User · October 10, 2007, 9:54am

I scanned the replies and didn’t see any mention of this so I’ll throw this
in since you state you’re seeing garbage characters:

Unicode strings are not guaranteed to be null terminated. In fact, in my
experience, most of them aren’t and it really depends on where they’re
coming from. This is especially true if you’re reading strings from the
registry.

The best way to handle this is to allocate a buffer that is equal to the
source UNICODE_STRING ‘Length’ field plus sizeof(WCHAR). Then copy ‘Length’
bytes from the Unicode string ‘Buffer’ to your WCHAR buffer (after you’ve
zeroed out your buffer or set the last WCHAR to nil).

If you are putting this in your own UNICODE_STRING variable, set ‘Buffer’
equal to your allocated buffer, ‘Length’ to the ‘Length’ from the source
Unicode string and ‘MaximumLength’ to the allocation size of your buffer.
This means that ‘MaximumLength’ will be two bytes longer than ‘Length’.
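Here is a minimal user-mode sketch of the approach just described. The struct is a stand-in for UNICODE_STRING (same three fields, byte-counted lengths), uint16_t stands in for WCHAR, and calloc stands in for a pool allocation; all names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for UNICODE_STRING; Length and MaximumLength are byte counts. */
typedef struct {
    uint16_t  Length;
    uint16_t  MaximumLength;
    uint16_t *Buffer;
} counted_wstr;

/* Copy src into a freshly allocated, zeroed buffer that has room for one
 * 16-bit terminator. Returns 1 on success, 0 on failure.
 * The caller frees dst->Buffer. */
int duplicate_terminated(counted_wstr *dst, const counted_wstr *src)
{
    size_t alloc_size;

    assert(dst != NULL && src != NULL);
    alloc_size = (size_t)src->Length + sizeof(uint16_t);
    if (alloc_size > UINT16_MAX)
        return 0;                       /* MaximumLength could not hold it */

    dst->Buffer = calloc(1, alloc_size);   /* zeroed, so already terminated */
    if (dst->Buffer == NULL)
        return 0;

    if (src->Length != 0)
        memcpy(dst->Buffer, src->Buffer, src->Length);
    dst->Length = src->Length;
    dst->MaximumLength = (uint16_t)alloc_size;
    return 1;
}
```

Because the buffer is zeroed before the copy, the extra two bytes at the end act as the terminator without a separate store.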

Anyway, this approach works for me. It’s simple and easy.

Jan_Milan · October 10, 2007, 10:52am

Hi,
I’m not sure whether the characters are single-byte or multibyte, but you can
easily examine that yourself in the debugger. There is also
RtlUnicodeToMultiByteN, which should return a multibyte string, but from the
docs it seems to do the same job as RtlUnicodeStringToAnsiString.
The problem may just be in the console where you are printing, so try a
different method of printing it. You could also enhance your app to work with
wide characters. If that is not possible, you could add an interface in user
mode that converts the string using the WideCharToMultiByte API function.

Anyway, there is at least one mistake in the code I sent you:

pchar_str[as_str.MaximumLength] = ANSI_NULL; // This writes past the allocated memory!!!

should be

pchar_str[as_str.Length] = ANSI_NULL; // This is correct, since RtlUnicodeStringToAnsiSize returns the size for a ZERO-TERMINATED string, so MaximumLength > Length

You can get rid of the warning you see with this approach:

as_str.MaximumLength = (USHORT) min(MAXUSHORT, RtlUnicodeStringToAnsiSize(&us_str));

Zorba_Kov · October 10, 2007, 11:09am

Thank you very much for the help!

David_J_Craig · October 10, 2007, 12:28pm

You have a design problem. It is neither logical nor desirable to do
Unicode-to-ANSI translations in the kernel when the target that receives those
strings is a user mode program. It is more difficult to do in drivers, given
the restrictions on allocating memory, IRQL, pageable translation tables, the
unknown user mode process codepage, and the limited support in kernel mode
compared to CString, etc. in user mode.

Zorba_Kov · October 10, 2007, 2:29pm

How can I send it as Unicode?

OSR_Community_User · October 10, 2007, 2:50pm

Do you mean how to send a UNICODE_STRING or how to send a PWCHAR? I assume that you mean the former; if you mean the latter, the answer is that you send it just as you currently send your PCHAR. If I have this wrong, or you don’t know how to do this, please write back after posting your code.

If you mean the former, a UNICODE_STRING is not what most user mode applications are expecting or wish to deal with, and there isn’t even a declaration for it in Windows.h. Most importantly, it is not a flat structure that can be passed directly. That is, it is not one contiguous chunk of data, because it contains an embedded pointer (Buffer). Accordingly, you have two choices: (1) you can flatten it yourself, or (2) you can just pass the contents of Buffer, probably with an extra step to ensure that it is UNICODE_NULL terminated. Personally, I don’t think that (1) makes much sense, as you would end up with something that isn’t even a UNICODE_STRING, so I would go with (2).

No matter what your user mode application is expecting, I would make sure that the string you pass is UNICODE_NULL terminated; not to do so is just asking for trouble. In practice, this means that you may have to allocate a temporary buffer large enough to hold the characters plus one UNICODE_NULL, copy the actual data to that buffer, and terminate it with a UNICODE_NULL. If MaximumLength >= Length + sizeof(WCHAR), you can add a UNICODE_NULL directly (if it isn’t already there), assuming that the string is yours to modify, skip the allocation and copy, and just pass Buffer.

I don’t know what your code looks like, but the size of the string (in bytes, including the terminating UNICODE_NULL) will probably get passed back in one of the arguments to DeviceIoControl, via the Information member of the IO_STATUS_BLOCK. If not, it is just a regular PWCHAR at this point, so it should not pose a problem for regular user mode use.

If you do want to pass the entire contents of a UNICODE_STRING, you just need to allocate a buffer large enough to hold sizeof(Length) + sizeof(MaximumLength) + the size of the character data (Buffer, or the temporary buffer you created above to ensure UNICODE_NULL termination), and then copy the bits to the appropriate offsets of that buffer.
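For what option (1) might look like, here is a hedged user-mode sketch. The layout (two 16-bit length fields followed directly by the characters and a terminator) is only one possible choice, and flat_header and flatten are names invented for this example; uint16_t stands in for WCHAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat layout: this header, then the characters, then a 16-bit
 * terminator. There is no embedded pointer, so the block can be copied whole. */
typedef struct {
    uint16_t Length;         /* bytes of character data, terminator excluded */
    uint16_t MaximumLength;  /* bytes of character data, terminator included */
} flat_header;

/* Returns the total number of bytes written to out, or 0 if it does not fit. */
size_t flatten(uint8_t *out, size_t out_size,
               const uint16_t *chars, uint16_t length_bytes)
{
    flat_header hdr;
    const uint16_t terminator = 0;
    size_t total = sizeof(hdr) + (size_t)length_bytes + sizeof(terminator);

    assert(out != NULL && chars != NULL);
    if (out_size < total)
        return 0;
    if ((size_t)length_bytes + sizeof(terminator) > UINT16_MAX)
        return 0;                       /* MaximumLength would overflow */

    hdr.Length = length_bytes;
    hdr.MaximumLength = (uint16_t)(length_bytes + sizeof(terminator));
    memcpy(out, &hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), chars, length_bytes);
    memcpy(out + sizeof(hdr) + length_bytes, &terminator, sizeof(terminator));
    return total;
}
```

The receiver reads the header first and then treats the bytes that follow as the string, which is why the result is no longer a real UNICODE_STRING.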

Good luck,

mm

David_J_Craig · October 10, 2007, 3:06pm

You should have an IoCtl to provide the answer. Check the buffer size against
the length of the Unicode string plus the two bytes for the terminator.
Zero the output buffer. Copy Unicode.Length bytes (that is, the character
count * sizeof(WCHAR); Length is already a byte count) from Unicode.Buffer to
the IoCtl buffer. Simple programming 101 for kernel work.
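Those steps can be sketched in portable C as follows. This is not an actual IOCTL handler: out and out_size stand in for the request's output buffer and its length, uint16_t stands in for WCHAR, and fill_output_buffer is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Check, zero, copy. length_bytes mirrors UNICODE_STRING.Length.
 * Returns the number of bytes to report back to the caller (text plus
 * terminator), or 0 if the output buffer is too small. */
size_t fill_output_buffer(void *out, size_t out_size,
                          const uint16_t *src, size_t length_bytes)
{
    size_t needed = length_bytes + sizeof(uint16_t);

    assert(out != NULL && src != NULL);
    if (out_size < needed)
        return 0;                    /* a driver would fail the request here */

    memset(out, 0, out_size);        /* zeroing supplies the terminator */
    memcpy(out, src, length_bytes);
    return needed;
}
```

Zeroing the whole buffer also avoids handing stale bytes back to user mode.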

Zorba_Kov · October 10, 2007, 4:25pm

Alright, I will try what you said…

Thank you very much!

OSR_Community_User · October 10, 2007, 11:40pm

UNICODE_STRING::Buffer is already a PWCHAR, just not zero-terminated.

If you need PCHAR and not PWCHAR, then you need to convert the string to
ANSI; use the RtlXxx routines (e.g. RtlUnicodeStringToAnsiString) for this.

If you just need zero-termination, then allocate UStr.Length +
sizeof(WCHAR) bytes, set pWStr to this allocation, and then:

RtlCopyMemory(pWStr, UStr.Buffer, UStr.Length);
pWStr[UStr.Length / sizeof(WCHAR)] = UNICODE_NULL;

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

OSR_Community_User · October 10, 2007, 11:40pm

I would never ever use PCHAR in modern software actually.

Use only PWSTR in both kernel and user mode code, this is a very good idea.

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

OSR_Community_User · October 10, 2007, 11:41pm

> … a UNICODE_STRING is not what most user mode
> applications are expecting or wish to deal with, and there isn’t even a
> declaration for it in Windows.h. Most importantly, it is not a flat structure
> that can be passed directly.

Just return the PWSTR array as the output buffer of the IOCTL, and set
Irp->IoStatus.Information to UStr.Length.

There is no need to ever return UStr.MaximumLength to user mode.
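On the receiving side, a reply sent this way is a counted array with no terminator, so the application has to add one itself. A minimal sketch, assuming the application learns the byte count from the call that fetched the data (bytes_returned here) and allocated its buffer with spare room; terminate_reply is a hypothetical name and uint16_t stands in for WCHAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Terminate a counted 16-bit string in place.
 * buf_size_bytes is the size of the caller's buffer; bytes_returned is the
 * byte count reported with the reply. Returns 1 and stores the character
 * count on success, 0 if the count is odd or leaves no room to terminate. */
int terminate_reply(uint16_t *buf, size_t buf_size_bytes,
                    size_t bytes_returned, size_t *char_count)
{
    size_t count;

    assert(buf != NULL && char_count != NULL);
    if (bytes_returned % sizeof(uint16_t) != 0)
        return 0;
    if (bytes_returned + sizeof(uint16_t) > buf_size_bytes)
        return 0;

    count = bytes_returned / sizeof(uint16_t);
    buf[count] = 0;
    *char_count = count;
    return 1;
}
```

Passing an output buffer two bytes larger than the largest expected string keeps the second check from failing.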

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com