test display driver performance

thunder_zhang · December 12, 2014, 1:35pm

Hello ,

I want to test display driver performance, specifically 2D. I have two cards in the system, each with two ports, driving four Dell 30-inch monitors at 2560x1600.

I use a C# DataGridView, maximized across all screens, with many rows and columns, and VirtualMode = true. A background thread calls the UpdateCellValue method, and the CellValueNeeded event handler supplies a random string.

The problem is as follows: as I create more data grids, the grids created earlier eventually stop updating. My thinking is that UpdateCellValue invalidates a rect in the grid, which sends WM_PAINT. With so many WM_PAINT messages, the GUI thread's message queue overflows; the grid can only draw cells on the GUI thread, so it falls behind.

How can I achieve greater performance? Should I use a technology like Direct2D or OpenGL?

Tim_Roberts · December 12, 2014, 3:01pm

xxxxx@gmail.com wrote:

> I want to test display driver performance. And want to test 2D. I have two cards in system, two ports each, connected to dell 30in monitors, 2560x1600 resolution, four total.
>
> I use C# datagridview, maximize to all screens, many rows and columns, grid in virtualmode = true. And background thread calls UpdateCellValue method. Cellvalueneeded event handler supplies random string.

That’s not testing display driver performance. The minimal display
driver time is completely swamped by the overhead of managing the grid
and generating and processing all of those window messages.

> How to achieve greater performance? Use technology like direct2D, OGL?

Well, what is it that you want to measure, exactly? And why? Are you
testing the hardware speed? Do you want to know how many textured
triangles you can show per second? Do you want to know how many filled
rectangles you can display per second?

My guess is that you don’t really know what you want to measure. In
that case, you can’t possibly generate any meaningful results.

There are already a billion graphics benchmarks in the world. Surely
one of those could do what you need.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

thunder_zhang · December 12, 2014, 3:38pm

Hi , Roberts ,

The underlying requirement is to display a real-time grid with many columns and rows at a high frame rate, not the benchmark itself. My boss says the bottleneck comes from the display driver or the system, because GDI is not accelerated enough in Windows 7 compared to DirectX. How can I understand GDI performance limitations versus WM_PAINT processing speed, etc.?

Maybe a benchmark is not the right approach. What I really want to know is the best way to achieve a high-FPS grid, or which technology to use: GDI, OpenGL, DirectX, Direct2D, Qt?

thunder_zhang · December 12, 2014, 4:24pm

Maybe this is off-topic for this list, then. I apologize if so.

Tim_Roberts · December 12, 2014, 5:56pm

xxxxx@gmail.com wrote:

> Underlying requirement is display real time grid with many columns, rows at high FPS.

How high? I mean, let’s be realistic. If you have a grid with 20
columns and 40 rows filled with numbers, that’s 800 individual
entries. No human being is going to be able to extract anything useful
from such a grid if it updates faster than about 3 times a second.

A grid is a tough thing to optimize, because each string has to be a
separate call into the display driver, which means a user/kernel and
kernel/user transition. The display driver overhead is insignificant,
as is the hardware drawing time. You couldn’t do anything about that,
anyway.

> Not benchmark itself. My boss says bottleneck comes from display driver or system, because GDI not accelerated enough in Windows 7, as compare to directx. How to understand GDI performance limitation vs WM_PAINT process speed, etc?

Your boss is smoking something. You’re talking about rectangles and
text. That’s not what people worry about when they talk about “graphics
acceleration”. There haven’t been any advances in that for a decade.

Do you actually have an application that is not responsive enough? Are
you using an off-the-shelf grid, or is it one you’ve designed?

> May be benchmark not right approach. Really want to know, best way to achieve high FPS grid?

Oh, a benchmark probably IS the right approach, but you need to be
looking at the whole application. I will bet you real dollars that any
bottleneck you find is in your grid control, and nowhere near the
display driver. You need to do some genuine profiling. Spend the time
to figure out exactly where your application really is spending its
time. There are good tools for that, and the results are usually
surprising.

> Or which technology to use? GDI, OGL, directX, direct2D, Qt?

Qt adds yet another layer. It can make programming easier, but it
certainly will not improve performance. OpenGL sucks at text, as does
Direct3D. There is a new technology called DirectWrite that is designed
to work with Direct2D to allow more direct interaction with the graphics
card, but I firmly believe you would be wasting your time and money to
consider that step before you have done a thorough profiling to
determine where the time is really being spent.

Seriously. Programmers are TERRIBLE at guessing where the bottlenecks
are. I recently did a real-time telemetry application where it turned
out that 30% of the CPU time was being spent in the arctangent routine.
We were able to replace that with a table-based lookup, and the
application’s responsiveness went WAY up.

And yes, unfortunately, this is not really the right forum for this
question.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
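
A table-based replacement like the arctangent one described above can be sketched as follows. This is only an illustration of the general technique, not the code from that telemetry application: the table size, the interpolation, and the name `fast_atan` are all assumptions. Whether a lookup actually beats the library routine depends on the language and hardware, so it has to be measured in the real application.

```python
import math

TABLE_SIZE = 1024  # hypothetical resolution; choose it from your accuracy needs

# Precompute atan(x) for x in [0, 1] once, at startup.
_ATAN_TABLE = [math.atan(i / TABLE_SIZE) for i in range(TABLE_SIZE + 1)]


def fast_atan(x: float) -> float:
    """Approximate atan(x) with a table lookup plus linear interpolation."""
    if x < 0.0:
        return -fast_atan(-x)                    # atan is an odd function
    if x > 1.0:
        return math.pi / 2 - fast_atan(1.0 / x)  # atan(x) = pi/2 - atan(1/x), x > 0
    pos = x * TABLE_SIZE
    i = min(int(pos), TABLE_SIZE - 1)            # clamp so that i + 1 stays in range
    frac = pos - i
    return _ATAN_TABLE[i] + frac * (_ATAN_TABLE[i + 1] - _ATAN_TABLE[i])


# Compare against the library routine over a range of inputs.
worst = max(abs(fast_atan(v / 100.0) - math.atan(v / 100.0))
            for v in range(-1000, 1001))
print(worst < 1e-6)
```

The trade-off is memory and a bounded approximation error in exchange for avoiding the full library computation on every call.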

Jan_Bottorff · December 12, 2014, 10:42pm

> You’re talking about rectangles and
> text. That’s not what people worry about when they talk about “graphics
> acceleration”. There haven’t been any advances in that for a decade.

I agree there haven’t been any real hardware improvements in 2D rendering
for a while, but my understanding is web browsers are a good example of
where hardware accelerated text and rectangle rendering help performance a
lot. There is also I believe a relatively new API to offload this to the
GPU, DirectWrite.

In the early days of hardware graphics acceleration, caching the font as
bitmaps in graphics card memory and using the graphics chip to do the
million little BitBlts needed for text was a big improvement. When good 3D
rendering hardware and APIs happened (i.e. Direct3D) text rendering was no
longer hardware accelerated so performance suffered. DirectWrite is
oriented to get hardware accelerated 2D again, perhaps as a layer over
Direct3D. Thru the magic of z-buffers, I believe you can even offload the
hit testing needed to make user interaction fast on complex 2D renderings,
see
http://msdn.microsoft.com/en-us/library/windows/desktop/dd756613(v=vs.85).aspx.

See http://msdn.microsoft.com/en-us/magazine/dn451448.aspx for some more
info on DirectWrite.

Jan

thunder_zhang · December 13, 2014, 3:32pm

Hi , Roberts ,

Appreciate guru .

We are using the built-in C# WinForms DataGridView class.

An update rate of 4 times per second is okay. The problem is as follows. The grid is maximized across many large monitors. When the data update rate gets high, there are too many calls to UpdateCellValue(), the window message queue overflows, and other windows stop updating as well. Or the mouse cursor moves jerkily and the user complains of “lag”. I also tried calling Invalidate() on the whole grid instead of on individual cells, but then the app freezes during each redraw: the mouse moves smoothly, jerks, moves smoothly, jerks...

I performed the following test: create a blank UserControl and override OnPaint(). Divide it into 50 x 10 rects to simulate a grid. In OnPaint(), fill all rects with a random color. Then call Invalidate() from a background thread 10 times per second. Result: CPU usage is very low and the form is responsive.

Then, in a second test, I fill each rectangle and also draw a random string into it. Now CPU usage is very high, with one core at 100%. This seems to match your point: “A grid is a tough thing to optimize, because each string has to be a separate call into the display driver, which means a user/kernel and kernel/user transition.”

If the possible string values are known in advance for each cell, can I pre-draw or cache them? Use a memory DC? Bottorff mentioned this for the “early days”. Does it still work now?

Tim_Roberts · December 14, 2014, 12:56am

On Dec 13, 2014, at 12:32 PM, xxxxx@gmail.com wrote:

> We are using C# winforms datagridview class. Built-in.

…the implementation of which is completely buried in the framework.

> …Grid is maximized on many large monitors. When data update rate get high, too many calls to UpdateCellValue(), window message queue overflows, other windows stop updating also. Or, mouse cursor move jagged, user complains “lag”. Also try call Invalidate() on whole grid, instead of cell, but app freeze during redraw, mouse smooth, jerk, smooth, jerk …
>
> I perform following test: create blank UserControl, override OnPaint(). Divide into 50 x 10 rects, to simulate grid. In OnPaint(), fill all rects with random color. Then call Invalidate() from background thread @ 10X / sec. Result, CPU usage very low. Form responsive.

Invalidate causes EVERY cell to be repainted. Do ALL of the cells change in every update? Perhaps you need to implement your own “grid” that simply draws each string when it needs to change.

I’m not sure the problem gets any better with a memory DC. You can certainly draw your strings into a bitmap, but then you have to blit the bitmap to the screen. Blitting a large rectangle isn’t going to be any quicker than drawing text strings.
—
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

thunder_zhang · December 14, 2014, 2:10am

Hi , Roberts ,

Not all grid cells change every time. Some change most of the time, a few some of the time, and some infrequently. Invalidate() was a worst-case test. I can optimize by drawing smarter, but I must avoid a flood of WM_PAINT from many small Invalidate(Rect) calls.

I performed a second test as follows. Recall that in the old test, drawing strings with Graphics.DrawString in OnPaint used 25% CPU (100% of one core), and the window jerked.

In the new test, I create 1,000 Bitmaps at startup and draw random strings into them. In OnPaint, I select a Bitmap at random and copy it to the Graphics with FillRectangle(TextureBrush). CPU usage is 5% in this test. So is a TextureBrush blit faster than GDI+ DrawString?

thunder_zhang · December 14, 2014, 9:12pm

Hi, M M ,

Sorry, your post was in its own thread on the NTDEV web site, so I did not see it at first.

The grid is in virtual mode. That means I give the number of columns and rows, and supply data in CellValueNeeded. There is no data binding. Bound grid performance is even worse.

I agree with your thinking. DataGridView is a generalized, one-size-fits-all solution. I thought my test showed that aggressively pre-creating cell value graphics, etc., can greatly increase performance.

Maxim_S_Shatskih · December 15, 2014, 12:19am

> Underlying requirement is display real time grid with many columns , rows at high FPS .

Funny.

Do you want some text strings and figures to flash before the user’s eyes, causing some psychological effects in him/her, like the fabulous “25th frame effect”?

Yes, DBGrid performance can be an issue, but this is only about scrolling speed, and not UpdateCellValue speed.

And, surely, this perf depends on the DBGrid plus the underlying DB query implementation (including things like whether the cursor is keyset or dynamic, and what client DLLs are used to access the DB) and not the display driver.

More so: in the mid-90s, there were DBGrid controls fast enough to be satisfactory, even on the hardware of 20 years ago.

–
Maxim S. Shatskih

Microsoft MVP on File System And Storage

xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 15, 2014, 12:24am

>Or which technology to use ? GDI, OGL, directX, direct2D, Qt ?

You forgot WPF/SilverLight on this list

–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 15, 2014, 12:26am

Are you really sure that cdd.dll does not cache the font glyphs in D3D surfaces the way the venerable S3 Trio chip was doing in the late 90s?

> longer hardware accelerated so performance suffered. DirectWrite is
> oriented to get hardware accelerated 2D again, perhaps as a layer over
> Direct3D.

So is Direct2D (Win7+), which is kind of a low-level GdiPlus-style API that uses the underlying HW facilities (surfaces etc.), the ones also used by Direct3D.

– Maxim S. Shatskih
Microsoft MVP on File System And Storage

xxxxx@storagecraft.com

http://www.storagecraft.com

Maxim_S_Shatskih · December 15, 2014, 12:34am

> Then, second test, fill rectangle, and, draw random string into rectangle. Now CPU usage very high, one core 100%.

Great tests! These ones really provide some understanding on where the perf issue is.

> Use Memory DC?

Yes, try this. I expect the Memory DC to go the same way as an in-memory D3D surface.

So, for each cell, keep the current picture of its text as a bitmap, which is only discarded/regenerated if the text was updated.

–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com
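
The per-cell caching idea above (keep each cell's rendered text as a bitmap, regenerate only when the text changes) can be sketched in a language-neutral way. This is a minimal model, not WinForms code: `CellBitmapCache`, `fake_render`, and the use of `bytes` as a stand-in for a bitmap are all hypothetical, and a real version would render into an off-screen bitmap and blit it during painting.

```python
from typing import Callable, Dict, Tuple

Cell = Tuple[int, int]  # (row, column)


class CellBitmapCache:
    """Keeps one rendered bitmap per cell; re-renders only when the text changes."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render                          # expensive text-to-bitmap step
        self._cache: Dict[Cell, Tuple[str, bytes]] = {}
        self.render_calls = 0                          # instrumentation for the demo

    def bitmap_for(self, cell: Cell, text: str) -> bytes:
        cached = self._cache.get(cell)
        if cached is not None and cached[0] == text:
            return cached[1]                           # text unchanged: reuse bitmap
        self.render_calls += 1
        bitmap = self._render(text)                    # text changed: regenerate
        self._cache[cell] = (text, bitmap)
        return bitmap


def fake_render(text: str) -> bytes:
    # Hypothetical stand-in for drawing a string into an off-screen bitmap.
    return text.encode("utf-8")


cache = CellBitmapCache(fake_render)
for _ in range(10):                  # ten paint passes with the same value
    cache.bitmap_for((0, 0), "101.25")
cache.bitmap_for((0, 0), "101.30")   # the value changes once
print(cache.render_calls)            # prints 2
```

If many cells show values from a small shared set, keying the cache by text instead of by cell (as in the 1,000-bitmap test earlier in the thread) would reduce memory, at the cost of needing an eviction policy when the set of values is unbounded.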

Maxim_S_Shatskih · December 15, 2014, 12:36am

>Blitting a large rectangle isn’t going to be any quicker than drawing text strings.

In his tests, filling the rect is far faster than TextOut.

So, blitting is also probably fast. Blitting from the memory DC (another surface in the VRAM) can be lightning fast.

–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · December 15, 2014, 12:38am

>TextureBrush blit faster than GDI+ draw string ?

GDI+ can be rather slow, due to the special effects it supports.

You can try going to the lowest level by calling TextOut directly.

Also, don’t forget about font anti-aliasing (“Smooth edges of screen fonts”), which can eat lots of time compared to plain TextOut.

–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

thunder_zhang · December 15, 2014, 1:42am

Hi, Shatskih,

I do not mean to offend.

The grid has a real-time data source, not a database or bound data source. In fact, the user doesn’t scroll much at all. The screens are large enough to display most of the useful data, and there are multiple screens.

I agree the user cannot read the content of many cells at a fast update rate. However, we also have features such as conditional painting, cell background colors, and cell highlighting. The pattern of highlights across the grid is very important to the user during fast updates. It gives more feedback than the actual cell values.

It seems we must implement our own grid rather than use the .NET DataGridView. But there is a lot of code to implement: bitmap cache, user input, column resizing, column reordering, ... and the boss wants a quick fix!

thunder_zhang · December 15, 2014, 1:39pm

Hi , M M ,

The app is multi-threaded. Data comes from 4-5 sources, and every source runs on its own thread. The thread code puts data in the backing store, then calls UpdateCellValue(). But how can the existing grid be used “multi-threaded”? I have to call UpdateCellValue(), and then the grid asks for the cell value later in CellValueNeeded, which is called from the “GUI thread”. This is the bottleneck, I think. Do you mean running every grid as a separate process, with IPC to control them?

I don’t get your meaning of “keep track of the validity yourself”. Do you mean knowing which cells need a repaint? How is that different from knowing when to call UpdateCellValue?

I thought my test showed that filling a blank rect or filling a rect with a bitmap is fast, but writing the cell string with GDI+ is slow. Thus, I proposed caching cell bitmaps. Is that not needed? Shatskih thinks it is.
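
One way to read “keep track of the validity yourself”, and how it differs from calling UpdateCellValue on every change, is sketched below. It is an interpretation, not the original poster's design: the data threads only record which cells changed, and the GUI thread asks at a fixed rate what needs repainting. The class `DirtyCells` and its methods are hypothetical names; in WinForms the `drain` step would presumably run from a GUI-thread timer tick and feed the result to an invalidate call.

```python
import threading
from typing import Optional, Set, Tuple

Cell = Tuple[int, int]            # (row, column)
Rect = Tuple[int, int, int, int]  # (top_row, left_col, bottom_row, right_col), inclusive


class DirtyCells:
    """Thread-safe set of changed cells, drained by the GUI thread at a fixed rate."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._dirty: Set[Cell] = set()

    def mark(self, row: int, col: int) -> None:
        # Called from data threads after writing the backing store. No GUI call here.
        with self._lock:
            self._dirty.add((row, col))

    def drain(self) -> Optional[Rect]:
        # Called from the GUI thread. Returns one bounding rect, or None if clean.
        with self._lock:
            cells, self._dirty = self._dirty, set()
        if not cells:
            return None
        rows = [r for r, _ in cells]
        cols = [c for _, c in cells]
        return (min(rows), min(cols), max(rows), max(cols))


dirty = DirtyCells()


def source(row: int) -> None:
    for i in range(1000):         # 1,000 updates per data source
        dirty.mark(row, i % 5)


threads = [threading.Thread(target=source, args=(r,)) for r in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(dirty.drain())              # (0, 0, 3, 4): one repaint request for 4,000 updates
print(dirty.drain())              # None: nothing changed since the last drain
```

A single bounding rect is wasteful when the dirty cells are scattered far apart; returning the set of cells and invalidating each cell's rect inside the same tick is an alternative that keeps the rate cap while repainting less.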

Maxim_S_Shatskih · December 15, 2014, 11:36pm

>When using a DGV, each cell is a sub-control and each one is validated and painted on its own.

How bad.

Any grids in WPF/SilverLight?

> worker thread that receives the update. After an update had been processed, use
> WM_INVALIDATE

InvalidateRect/Rgn APIs

–
Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

Tim_Roberts · December 16, 2014, 12:13pm

Marion Bond wrote:

> After an update had been processed, use WM_INVALIDATE or the
> equivalent .NET call to force a re-paint. Importantly, don’t trigger
> another paint until the last one is complete (plus any delay for a
> human to read it).

To a certain extent, that happens automatically. The message loop won’t
even check for another WM_PAINT until it has finished processing the
current one. And if 50 WM_PAINTs come in while the loop is busy, only
one gets registered.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
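
The coalescing behavior described above can be modeled with a toy message pump. This is a simplified sketch, not the Win32 implementation: `ToyWindow` and its methods are hypothetical, and the real update region is an arbitrary region rather than the single bounding rectangle used here. The point it illustrates is that invalidation grows an update region instead of queueing one paint message per call, and that a paint is produced only when no other messages are waiting.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def union(a: Optional[Rect], b: Rect) -> Rect:
    """Bounding rectangle of a and b (a may be empty)."""
    if a is None:
        return b
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


class ToyWindow:
    """Simplified model: invalidation grows an update region, it queues nothing."""

    def __init__(self) -> None:
        self.queue: Deque[str] = deque()       # posted messages (input, timers, ...)
        self.update_region: Optional[Rect] = None
        self.paints: List[Rect] = []           # what each paint pass had to redraw

    def invalidate(self, rect: Rect) -> None:
        self.update_region = union(self.update_region, rect)

    def pump_once(self) -> Optional[str]:
        if self.queue:                         # posted messages are handled first
            return self.queue.popleft()
        if self.update_region is not None:     # queue empty: produce a single paint
            self.paints.append(self.update_region)
            self.update_region = None          # painting validates the region
            return "WM_PAINT"
        return None


w = ToyWindow()
for i in range(50):                            # 50 invalidations while the loop is busy
    w.invalidate((i, 0, i + 1, 1))
print(w.pump_once(), w.paints)                 # WM_PAINT [(0, 0, 50, 1)]
print(w.pump_once())                           # None
```

Under this model, a lagging GUI thread is more likely explained by each paint pass being expensive, or by other queued work running ahead of painting, than by paint requests piling up.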