Unicode and NLS vs. file filters

I was wondering where I might find more info on Unicode and NLS issues
impacting on file filter development.

For example, filtering on a specific drive involves hooking one or more
drives. All of the samples I have seen start from a fileName such as:
L"\DosDevices\C:\"

Does such a device even exist in, say, a Japanese version of Win2K?

Another, related question… We can often use pathname separator searches
for various things (for example, separating the directory path from the
base file name). I’ll bet that the pathname separator in Katakana is not
the same as for US English, L"\". Or, a drive letter itself is probably
not as simple as L"C:" in other languages.

Have any of you developed filters that work across different NLS versions
of the OS? How are these issues to be handled?

In a filter, all the pathnames are Unicode. In Unicode, L"\" and L"/"
are always themselves. I believe that \DosDevices is NOT a localized
string, so those pathnames will always work. A user file pathname,
however, can have any Unicode character in it on any copy of NT, 2000,
or XP.
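As a rough illustration of the point above, here is a minimal sketch of splitting a counted UTF-16 name at the last backslash. It is portable C rather than driver code: the plain uint16_t buffer plus length stands in for a UNICODE_STRING, and the names PATH_SEPARATOR, final_component_start and final_component_self_check are made up for this example. The only thing it relies on is that the separator is the code unit 0x005C, whatever other characters the name contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-16 code unit for the backslash (U+005C REVERSE SOLIDUS). */
#define PATH_SEPARATOR ((uint16_t)0x005C)

/*
 * Returns the index of the first code unit of the final path component,
 * that is, one past the last U+005C. Returns 0 when no separator is found.
 * The name is counted (length in code units) and need not be
 * NUL-terminated, much like a UNICODE_STRING buffer.
 */
static size_t final_component_start(const uint16_t *name, size_t length)
{
    size_t i = length;

    assert(name != NULL || length == 0);
    while (i > 0) {
        if (name[i - 1] == PATH_SEPARATOR) {
            return i;
        }
        --i;
    }
    return 0;
}

/* Self-check with a katakana path and with a name containing U+00A5. */
void final_component_self_check(void)
{
    static const uint16_t katakana_path[] = {
        0x005C, 0x30C6, 0x30B9, 0x30C8,          /* backslash, te su to     */
        0x005C, 0x30D5, 0x30A1, 0x30A4, 0x30EB,  /* backslash, fu a i ru    */
        0x002E, 0x0074, 0x0078, 0x0074           /* .txt                    */
    };
    static const uint16_t yen_name[] = { 0x0061, 0x00A5, 0x0062 };

    /* The last backslash is at index 4, so the base name starts at 5. */
    assert(final_component_start(katakana_path, 13) == 5);
    /* U+00A5 YEN SIGN is an ordinary name character, not a separator. */
    assert(final_component_start(yen_name, 3) == 0);
}
```

In a real filter the same loop would run over FileName.Buffer with FileName.Length / sizeof(WCHAR) code units; that mapping is an assumption of this sketch, not something stated in the thread.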

-----Original Message-----
From: Bill [mailto:xxxxx@endwell.net]
Sent: Wednesday, February 05, 2003 11:07 AM
To: File Systems Developers
Subject: [ntfsd] Unicode and NLS vs. file filters

You are currently subscribed to ntfsd as: xxxxx@basistech.com To
unsubscribe send a blank email to xxxxx@lists.osr.com

For the Japanese version, you should use the Unicode code point of the
yen sign as the separator.

Shangwu


Is that to say that the method of enumerating drives via \DosDevices\A
and “incrementing” the ‘A’ should work regardless of locale? I want to
believe it (it makes things easier) but I have these doubts…

> In a filter, all the pathnames are Unicode. In Unicode, L"\" and L"/"
> are always themselves. I believe that \DosDevices is NOT a localized
> string, so those pathnames will always work. A user file pathname,
> however, can have any Unicode character in it on any copy of NT, 2000,
> or XP.

I believe that the advice to use the Unicode yen sign as the separator is
flat-out wrong for NT, 2000, or XP. I can’t speak to 95.

When you see a Yen sign in a window, you are seeing the ShiftJIS
character 0x5C, which is transcoded to backslash (U+005C), NOT yen
(U+00A5), in Unicode.

-----Original Message-----
From: Shangwu Qi [mailto:xxxxx@fcni.com]
Sent: Wednesday, February 05, 2003 11:37 AM
To: File Systems Developers
Subject: [ntfsd] Re: Unicode and NLS vs. file filters


The question in my mind is the ‘A’. It is remotely possible that
Microsoft will let you create a symlink named with the Chinese character
for turtle, or whatever. Perhaps one of the MS people who helpfully
monitor this list could illuminate that question?

-----Original Message-----
From: Bill [mailto:xxxxx@endwell.net]
Sent: Wednesday, February 05, 2003 12:38 PM
To: File Systems Developers
Subject: [ntfsd] RE: Unicode and NLS vs. file filters


IIRC Unicode is the same for all languages including Japanese.


This will NOT work regardless of locale. Use mount manager APIs to
enumerate the drive letters.

Max
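The thread does not spell out what the mount manager approach looks like. One piece of it can be sketched in portable C: when the mount manager is asked for its mount points (for instance with IOCTL_MOUNTMGR_QUERY_POINTS), drive letters come back as symbolic link names of the form \DosDevices\X:, and the caller has to recognize them. The function below is a hypothetical helper, loosely modeled on the MOUNTMGR_IS_DRIVE_LETTER check in mountmgr.h; the uint16_t buffer stands in for a counted WCHAR string, and the name drive_letter_of_link is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the drive letter (0x0041..0x005A, 'A'..'Z') when the counted
 * UTF-16 name is exactly \DosDevices\X:, and 0 for anything else.
 */
static uint16_t drive_letter_of_link(const uint16_t *name, size_t length)
{
    static const char prefix[] = "\\DosDevices\\";
    const size_t prefix_len = sizeof(prefix) - 1;   /* 12 code units */
    size_t i;

    assert(name != NULL || length == 0);
    if (length != prefix_len + 2) {
        return 0;
    }
    for (i = 0; i < prefix_len; ++i) {
        if (name[i] != (uint16_t)(unsigned char)prefix[i]) {
            return 0;
        }
    }
    if (name[prefix_len] < 0x0041 || name[prefix_len] > 0x005A) {
        return 0;   /* not an upper-case ASCII letter */
    }
    if (name[prefix_len + 1] != 0x003A) {
        return 0;   /* missing the trailing colon */
    }
    return name[prefix_len];
}

/* Self-check: a normal C: link, and a link named with U+4E80 (turtle). */
void drive_letter_self_check(void)
{
    static const uint16_t c_drive[] = {
        '\\', 'D', 'o', 's', 'D', 'e', 'v', 'i', 'c', 'e', 's', '\\', 'C', ':'
    };
    static const uint16_t turtle[] = {
        '\\', 'D', 'o', 's', 'D', 'e', 'v', 'i', 'c', 'e', 's', '\\', 0x4E80, ':'
    };

    assert(drive_letter_of_link(c_drive, 14) == 'C');
    assert(drive_letter_of_link(turtle, 14) == 0);
}
```

The sketch only classifies names; sending the IOCTL and walking the returned buffer are left out because they need the Windows headers.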


I have two FS filters that have both ANSI and Unicode versions.
While the separator glyph ISN’T the same, the character code used is. For
example, in Korean the separator is displayed as the Korean Won character
(similar to a W with a strikethrough), which is typed via the '\' key.
My filters attach via IoRegisterFsRegistrationChange, though, so I’m not
worried :-)


Kind regards, Dejan M. www.alfasp.com
E-mail: xxxxx@alfasp.com
Alfa Transparent File Encryptor - Transparent file encryption services.
Alfa File Protector - File protection and hiding library for Win32
developers.
Alfa File Monitor - File monitoring library for Win32 developers.

That’s what it looks like. However, the keyboard key used to type the
character is the same as the '\' one. Although they DISPLAY differently
to the user, they are the same in a FSF.


Actually that is the same symbol. Microsoft states in some of its documents
on localization that the path separator is always the same in every
language. In the Japanese version (and maybe some others) a yen mark is
displayed in place of the backslash. But that’s fine, as people who use the
Japanese version use the yen mark as a path separator. So, the answer is
yes, you always use L"\" as the path separator.

Another question that bothered me some time ago was whether I can get an
L"/" symbol as a separator. But looking at the sources (SFilter, FastFat and
the like) I noticed that they are aware only of the L"\" one. So, it might
be the I/O Manager (or even a user-mode layer) that translates L"/" to L"\".
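To make the guess above concrete, here is a minimal sketch of what such a translation amounts to, assuming it happens in a layer above the filter before the name reaches the file system. This is not the actual Windows code; slashes_to_backslashes is a made-up helper operating on a counted UTF-16 buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Replaces every U+002F (forward slash) with U+005C (backslash) in place
 * and returns how many code units were changed.
 */
static size_t slashes_to_backslashes(uint16_t *name, size_t length)
{
    size_t i;
    size_t replaced = 0;

    assert(name != NULL || length == 0);
    for (i = 0; i < length; ++i) {
        if (name[i] == 0x002F) {
            name[i] = 0x005C;
            ++replaced;
        }
    }
    return replaced;
}

/* Self-check on C:/dir/f: two slashes are rewritten, a second pass finds none. */
void slash_self_check(void)
{
    uint16_t name[] = { 'C', ':', '/', 'd', 'i', 'r', '/', 'f' };

    assert(slashes_to_backslashes(name, 8) == 2);
    assert(name[2] == 0x005C && name[6] == 0x005C);
    assert(slashes_to_backslashes(name, 8) == 0);
}
```

If that assumption holds, a filter that searches only for 0x005C, as the sample sources do, sees the same separator no matter which slash the user typed.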


In this context, how about other special symbols like ‘:’, ‘.’, ‘?’ and ‘*’?
Are they also preserved at the character-code level? I’ve noticed that
fastfat explicitly checks for ‘*’ in dircontrol’s search template.

-----Original Message-----
From: Alexey Logachyov [mailto:xxxxx@vba.com.by]
Sent: Thursday, February 06, 2003 4:31 AM
To: File Systems Developers
Subject: [ntfsd] Re: Unicode and NLS vs. file filters


Folks,

The situation with file name characters, or any other characters, in the
NT kernel is very simple. They are in Unicode. They don’t change from
one ‘localized’ version of the system to another, because the kernel
doesn’t change.

Support for the old multi-byte code pages exists only out in user32
land.

Several writers on this list have been, I’m afraid, a little confused
about this, or perhaps projecting from the behavior of 9x to NT.

Pathname separators are particularly confusing, due to a historical
artifact. In the original Microsoft Japanese code page, the Yen sign
occupied the same binary code point as the \ in ASCII. Back in
pre-directory DOS, this was deemed clever.

When pathnames came along, Microsoft simply used the Yen (and Won) as
the separator. Why? Because the official \ character in the JIS/ShiftJIS
set was double-byte, and they were, naturally, not too enthusiastic. Net
result: JP windows users learned to think of Yen and \ as sort-of the
same thing.

Later, much, much later, Unicode came along. Unicode has separate code
points for the single-width \, the double-width \, the single-width Yen,
and the double-width Yen. Microsoft, at least initially, coded Windows
to treat Yen as equivalent to the Unicode \, NOT to the Unicode Yen!
That made pathnames work compatibly in NT in Japanese.
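The mapping described above can be sketched in a few lines of portable C. The decoder below covers only the single-byte part of CP932 (the ASCII range and the halfwidth katakana range), which is enough to show that the byte a Japanese user sees as a yen sign, 0x5C, arrives in Unicode as U+005C. The function and constant names are invented for this example, and double-byte characters are deliberately not handled.

```c
#include <assert.h>
#include <stdint.h>

/* The four Unicode code points that the text distinguishes. */
enum {
    U_REVERSE_SOLIDUS           = 0x005C, /* single-width backslash */
    U_YEN_SIGN                  = 0x00A5, /* single-width yen       */
    U_FULLWIDTH_REVERSE_SOLIDUS = 0xFF3C, /* double-width backslash */
    U_FULLWIDTH_YEN_SIGN        = 0xFFE5  /* double-width yen       */
};

/* Returned for lead bytes and anything else this sketch does not decode. */
#define CP932_NOT_SINGLE_BYTE 0xFFFFu

/* Decodes one single-byte CP932 character to a UTF-16 code unit. */
static uint16_t cp932_single_byte_to_utf16(uint8_t b)
{
    if (b <= 0x7F) {
        /* Includes 0x5C -> U+005C, even though Japanese fonts draw a yen. */
        return b;
    }
    if (b >= 0xA1 && b <= 0xDF) {
        /* Halfwidth katakana block, U+FF61..U+FF9F. */
        return (uint16_t)(0xFF61 + (b - 0xA1));
    }
    return CP932_NOT_SINGLE_BYTE;
}

/* In the kernel, only U+005C separates path components. */
static int is_kernel_path_separator(uint16_t c)
{
    return c == U_REVERSE_SOLIDUS;
}

/* Self-check: the "yen" key produces a separator, the real yen signs do not. */
void cp932_self_check(void)
{
    uint16_t typed = cp932_single_byte_to_utf16(0x5C);

    assert(typed == U_REVERSE_SOLIDUS);
    assert(is_kernel_path_separator(typed));
    assert(!is_kernel_path_separator(U_YEN_SIGN));
    assert(!is_kernel_path_separator(U_FULLWIDTH_REVERSE_SOLIDUS));
    assert(!is_kernel_path_separator(U_FULLWIDTH_YEN_SIGN));
}
```

On Windows the real conversion is done by the system's code page tables (for example through MultiByteToWideChar with code page 932); the table logic here is a simplified stand-in for that.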

Of late, more and more Microsoft bits of software have begun to
distinguish the Yen from the \, but no one is going to change the
mapping in CP932.

So, for you file system developers out there, a \ is a \ is a \. Accept
no substitutes, and don’t worry about Yen signs. User-mode programmers
have other headaches.