File Encryption, Memory mapping files

Hi All.
I’m working on on-the-fly file encryption. I’m really confused by memory mapping files.

  1. there is a real folder and a virtual folder.
  2. files on disk are ciphertext.
  3. when files are opened in real folder, they are ciphertext;
  4. when files are opened in virtual folder, they are plaintext.
Everything is OK except for memory-mapped files. For example, when notepad.exe reads data from the cache, the minifilter cannot capture it, so I cannot perform encryption or decryption for it. It’s out of control.

I would appreciate any suggestions or links to threads/posts.
Thanks in advance.

It is not out of control. It is a deficiency of the minifilter you are developing.

Memory mapping means paging IO. Your design somehow skips paging IO.
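To make the point about paging IO concrete, here is a minimal user-mode sketch of the decision an encryption filter has to make on a read. It is an illustration under assumptions, not the OP's filter: the SKETCH_* names and values are local stand-ins named after the IRP flags (take the real definitions from the WDK headers), read_needs_decrypt and complete_read are hypothetical helpers, and the XOR is a placeholder for a real cipher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins named after the IRP flags; the values are assumptions
 * for this sketch and must come from the WDK headers in a real driver. */
enum {
    SKETCH_IRP_NOCACHE               = 0x01,
    SKETCH_IRP_PAGING_IO             = 0x02,
    SKETCH_IRP_SYNCHRONOUS_PAGING_IO = 0x40
};

#define SKETCH_XOR_KEY 0x5Au /* placeholder "cipher", not real encryption */

/* A read that goes to disk (non-cached or paging) carries ciphertext and
 * must be transformed. A cached read is served from pages that were
 * already transformed when they were paged in, so it is left alone. */
bool read_needs_decrypt(uint32_t irp_flags)
{
    const uint32_t disk_path = SKETCH_IRP_NOCACHE |
                               SKETCH_IRP_PAGING_IO |
                               SKETCH_IRP_SYNCHRONOUS_PAGING_IO;
    return (irp_flags & disk_path) != 0;
}

/* Hypothetical completion step: decrypt the buffer in place only when
 * the request actually reached the disk path. */
void complete_read(uint32_t irp_flags, uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (!read_needs_decrypt(irp_flags))
        return;
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(buf[i] ^ SKETCH_XOR_KEY);
}
```

If the paging case is left out of a check like this, the pages that back a mapped view are filled with ciphertext, which matches the notepad.exe symptom described above.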

Keep in mind that after a page has been brought from disk into memory (at the request of MM or CC), it is not read a second time on each access and/or file mapping. The system keeps pages to accelerate future access and frees LRU pages only when there is pressure for free pages. The Cache Manager and the Memory Manager share the same pages.
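The behaviour described above can be modelled in a few lines of user-mode C. This is only a toy simulation under stated assumptions: the 16-byte "page", the arrays standing in for disk and memory, the XOR placeholder cipher, and all function names are invented for the illustration. It shows that the filter gets exactly one chance per page, at page-in time, and that later accesses are plain memory loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE  16u   /* tiny "page" to keep the demo readable */
#define SKETCH_PAGE_COUNT 4u
#define SKETCH_XOR_KEY    0x5Au /* placeholder cipher, not real encryption */

static uint8_t  disk[SKETCH_PAGE_COUNT * SKETCH_PAGE_SIZE];  /* ciphertext */
static uint8_t  cache[SKETCH_PAGE_COUNT * SKETCH_PAGE_SIZE]; /* resident pages */
static int      resident[SKETCH_PAGE_COUNT];
static unsigned paging_reads; /* how many requests the "filter" saw */

/* Put plaintext on the simulated disk in encrypted form. */
void disk_store(const char *plain, size_t len)
{
    assert(len <= sizeof disk);
    for (size_t i = 0; i < len; i++)
        disk[i] = (uint8_t)((uint8_t)plain[i] ^ SKETCH_XOR_KEY);
}

/* Paging read: the only place where the filter can transform the data. */
static void page_in(unsigned page)
{
    size_t off = (size_t)page * SKETCH_PAGE_SIZE;
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
        cache[off + i] = (uint8_t)(disk[off + i] ^ SKETCH_XOR_KEY);
    resident[page] = 1;
    paging_reads++;
}

/* Cached or mapped access: fault the page in on a miss, otherwise just
 * load from memory without any filter involvement. */
uint8_t access_byte(size_t offset)
{
    assert(offset < sizeof cache);
    unsigned page = (unsigned)(offset / SKETCH_PAGE_SIZE);
    if (!resident[page])
        page_in(page);
    return cache[offset];
}

unsigned paging_read_count(void)
{
    return paging_reads;
}
```

After disk_store(), the first access_byte() on a page raises paging_read_count() by one; every later access to the same page leaves it unchanged. If page_in() skipped the transform, every later reader of that page would see ciphertext.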

Thanks Slava.

I agree with you that memory mapping means paging IO when no data is cached.

If a file is opened by notepad.exe when there is data cached, I think it is out of control.

I am a novice in this field. Are there any methods that we can use to intercept the operations when there is data cached?

Thanks for any hints, advice.

> If a file is opened by notepad.exe when there is data cached, I think it is out of control.

Don’t know what you mean by “out of control”. As Slava said, the file is in
the cache and notepad just fishes the content out of the cache. This implies
that the contents of the cache are wrong. Again, as Slava said, the only way
that happens is because the data returned from paging reads was wrong.

> Are there any methods that we can use to intercept the operations when
> there is data cached?

No. It’s a read of a bit of virtual memory. You can even look at it in the
debugger. !finddata is your friend.

> I am a novice in this field.

Depending on what you mean by “this field”, I would be worried if I were
you. Whilst not the most complicated part of filesystems, what you are
attempting is in the top decile. Equally, in Windows, file systems are
probably in the top quartile of complication. Systems programming is
different from applications programming. But all in all this is not
friendly territory for a novice.

> Thanks for any hints, advice.

Get training.

Hi Rod, thanks for your reply.

> Don’t know what you mean by “out of control”.

Sorry, I did not express myself clearly. I mean that if apps read memory-mapped files from the cache, the minifilter cannot intercept these operations, as you said: “No. It’s a read of a bit of virtual memory.”

>But all in all this it not friendly territory for a novice.

Thanks. Every seasoned one comes from a novice :)

Rod wasn’t kidding - get some training. “It all works except for
notepad.exe…” is a classic on this list. It’s also a hard message to hear -
this is a very difficult problem to solve and most (companies, that is)
solve it by purchasing a solution from our list host or one of the select
handful of alternative vendors and plugging in their own encryption and
management.

The essential difference between how the NT kernel works and how Unix was
designed is that NT caches streams of data (above the file system), whereas
on Unix data is cached at the block layer. Thus you as a file system
engineer have to be aware of cache coherency details, which significantly
increases the complexity of doing any sort of file system filter that
transforms data - such as encryption.

t.

On Fri, Jul 28, 2017 at 3:55 AM, xxxxx@gmail.com wrote:

> Hi Rod, thanks for your reply.
>
> >Don’t know what you mean by “out of control”.
>
> Sorry I did not express clearly. I mean if Apps read memory mapping files
> from cache, minifilter cannot intercept these operations, as you said “No.
> It’s a read of a bit of virtual memory.”
>
> >>But all in all this it not friendly territory for a novice.
>
> Thanks. Every seasoned one comes from a novice :)
>
>
> —
> NTFSD is sponsored by OSR

Thanks Tracy.

Kernel mode programming is very different from user mode programming: so many new concepts, kernel mode objects, blue screens… But things will get better and better.

Thanks for your kind reply.

This is true only for ancient *NIX kernels. Modern kernels use the same technique as NT with caching backed by file mapping structures.

For example, below is a call stack from my test machine running the Linux 4.12.2 kernel. While processing the read() system call, the ext4 read operation ext4_file_read_iter called the “Linux cache manager” (do_generic_file_read -> page_cache_sync_readahead) to bring data into the cache, which is backed by mapped-file structures (struct address_space).

This resulted in a recursive call through mapping->a_ops->readpages into the file system’s ext4_readpages. This is an analogue of a cached read in NT. Mac OS X uses the same caching-by-file-mapping technique, borrowed from BSD.

Thread 2 hit Breakpoint 9, ext4_readpages (file=0xffff88001d59b300, mapping=0xffff88001d1d56c0, pages=0xffffc90000817c30, nr_pages=1) at …/fs/ext4/inode.c:3308
3308 WARN_ON(page_has_buffers(page) && buffer_jbd(page_buffers(page)));
(gdb) bt
#0 ext4_readpages (file=0xffff88001d59b300, mapping=0xffff88001d1d56c0, pages=0xffffc90000817c30, nr_pages=1) at …/fs/ext4/inode.c:3308
#1 0xffffffff811b6288 in read_pages (gfp=<optimized out>, nr_pages=<optimized out>, pages=<optimized out>, filp=<optimized out>, mapping=<optimized out>) at …/mm/readahead.c:121
#2 __do_page_cache_readahead (mapping=<optimized out>, filp=<optimized out>, offset=1, nr_to_read=<optimized out>, lookahead_size=<optimized out>) at …/mm/readahead.c:199
#3 0xffffffff811b64b8 in ra_submit (ra=<optimized out>, ra=<optimized out>, ra=<optimized out>, filp=<optimized out>, mapping=<optimized out>) at …/mm/internal.h:66
#4 ondemand_readahead (mapping=0xffff88001d1d56c0, ra=0xffff88001d59b398, filp=0xffff88001d59b300, hit_readahead_marker=<optimized out>, offset=0, req_size=<optimized out>) at …/mm/readahead.c:478
#5 0xffffffff811b678e in page_cache_sync_readahead (mapping=<optimized out>, ra=<optimized out>, filp=<optimized out>, offset=<optimized out>, req_size=<optimized out>) at …/mm/readahead.c:510
#6 0xffffffff811a7a62 in do_generic_file_read (written=<optimized out>, iter=<optimized out>, ppos=<optimized out>, filp=<optimized out>) at …/mm/filemap.c:1813
#7 generic_file_read_iter (iocb=0x20000, iter=<optimized out>) at …/mm/filemap.c:2069
#8 0xffffffff812d1386 in ext4_file_read_iter (iocb=0xffff88001d59b300, to=0xffff88001d1d56c0) at …/fs/ext4/file.c:70
#9 0xffffffff81237680 in call_read_iter (file=<optimized out>, iter=<optimized out>, kio=<optimized out>) at …/include/linux/fs.h:1728
#10 new_sync_read (ppos=<optimized out>, len=<optimized out>, buf=<optimized out>, filp=<optimized out>) at …/fs/read_write.c:440
#11 __vfs_read (file=0xffff88001d59b300, buf=<optimized out>, count=<optimized out>, pos=0xffffc90000817f18) at …/fs/read_write.c:452
#12 0xffffffff81237cc3 in vfs_read (file=0xffff88001d59b300, buf=0x7fb92a0cb000 <error: Cannot access memory at address 0x7fb92a0cb000>, count=<optimized out>, pos=0xffffc90000817f18) at …/fs/read_write.c:473
#13 0xffffffff81239385 in SYSC_read (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at …/fs/read_write.c:589
#14 SyS_read (fd=<optimized out>, buf=140433251151872, count=131072) at …/fs/read_write.c:582
#15 0xffffffff818aaffb in entry_SYSCALL_64 () at …/arch/x86/entry/entry_64.S:203

(gdb) f 4
#4 ondemand_readahead (mapping=0xffff88001d1d56c0, ra=0xffff88001d59b398, filp=0xffff88001d59b300, hit_readahead_marker=<optimized out>, offset=0, req_size=<optimized out>) at …/mm/readahead.c:478
478 return ra_submit(ra, mapping, filp);

(gdb) p/x mapping
$13 = 0xffff88001d1d56c0

(gdb) p/x *mapping
$14 = {host = 0xffff88001d1d5548, page_tree = {gfp_mask = 0x1180020, rnode = 0x0}, tree_lock = {{rlock = {raw_lock = {val = {counter = 0x0}}}}}, i_mmap_writable = {counter = 0x0}, i_mmap = {
rb_node = 0x0}, i_mmap_rwsem = {count = {counter = 0x0}, wait_list = {next = 0xffff88001d1d56f0, prev = 0xffff88001d1d56f0}, wait_lock = {raw_lock = {val = {counter = 0x0}}}, osq = {tail = {
counter = 0x0}}, owner = 0x0}, nrpages = 0x0, nrexceptional = 0x0, writeback_index = 0x0, a_ops = 0xffffffff81a3a680, flags = 0x0, private_lock = {{rlock = {raw_lock = {val = {
counter = 0x0}}}}}, gfp_mask = 0x14200ca, private_list = {next = 0xffff88001d1d5740, prev = 0xffff88001d1d5740}, private_data = 0x0}

(gdb) ptype mapping
type = struct address_space {
    struct inode *host;
    struct radix_tree_root page_tree;
    spinlock_t tree_lock;
    atomic_t i_mmap_writable;
    struct rb_root i_mmap;
    struct rw_semaphore i_mmap_rwsem;
    unsigned long nrpages;
    unsigned long nrexceptional;
    unsigned long writeback_index;
    const struct address_space_operations *a_ops;
    unsigned long flags;
    spinlock_t private_lock;
    gfp_t gfp_mask;
    struct list_head private_list;
    void *private_data;
} *
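The same coupling between read() and file mappings can be observed from user space. Below is a small POSIX sketch, assuming a Linux-like system with a writable /tmp; the function name and temporary file name are made up for the example. It writes a file, reads it back with pread(), then maps it and compares the two views. It does not prove that the pages are physically shared, only that the mapped view and the read() path return the same bytes, and that the comparison against the mapping is done with plain memory loads rather than another read call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the pread() view and the mmap() view of a freshly written
 * temporary file both match `text`, 0 if they differ, -1 on setup failure. */
int mapped_view_matches_read(const char *text)
{
    char path[] = "/tmp/pagecache_sketch_XXXXXX"; /* hypothetical name */
    size_t len = strlen(text);
    assert(len > 0); /* mmap() rejects a zero-length mapping */

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    int result = -1;
    char *buf = malloc(len);
    if (buf != NULL &&
        write(fd, text, len) == (ssize_t)len &&
        pread(fd, buf, len, 0) == (ssize_t)len) {
        void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* Reading through `map` is an ordinary memory access. */
            result = memcmp(map, buf, len) == 0 &&
                     memcmp(buf, text, len) == 0;
            munmap(map, len);
        }
    }
    free(buf);
    close(fd);
    return result;
}
```

Calling mapped_view_matches_read("plaintext page") is expected to return 1 on a system where the temporary file can be created.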