Video Latency
Posted: 02 February 2010 09:31 PM by igor1960 (Jr. Member)

I'm using an internally written tool that reports capture/rendering latency, and I'm seeing strange statistics that I'd like you, Alex, to explain.
You claim (and I personally agree) that the PS3Eye is an extremely low-latency camera, so the statistics below might interest you and others. However, I'd like you to explain the strange degradation in latency at high frame rates (above 50 fps; see below):

640*480 @ 15 fps:  latency ~130 ms, or around 2 frames
640*480 @ 30 fps:  latency ~66 ms, or around 2 frames
640*480 @ 40 fps:  latency ~75 ms, or around 3 frames
640*480 @ 50 fps:  latency ~60 ms, or around 3 frames   <== this is the best
640*480 @ 60 fps:  latency ~116 ms, or around 7 frames
640*480 @ 75 fps:  latency ~116 ms, or around 8-9 frames
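(For reference, the frame counts are just the measured latency divided by the frame period: at 30 fps one frame is 1000/30 ≈ 33 ms, so ~66 ms is about 2 frames; at 75 fps one frame is 1000/75 ≈ 13.3 ms, so ~116 ms is 8-9 frames.)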

The test is performed using Direct3D rendering to a 75 Hz monitor.
So the question is: why is the latency only around 2-3 frames up to 50 fps, while at 60 and 75 fps it increases to 7-9 frames?

Posted: 02 February 2010 11:01 PM   [ # 1 ] by AlexP (Administrator)
igor1960 - 02 February 2010 09:31 PM (quoted above)
Are you using the SDK or DirectShow? How are you measuring the capture latency?
Posted: 02 February 2010 11:10 PM   [ # 2 ] by igor1960 (Jr. Member)

Alex,

I'm talking about DirectShow, measured with my internal tool.
Basically, it sends a black frame to the display and remembers the time that frame was sent, while the camera is facing that display.
It then inspects each captured frame and finds the first one in which that black frame appears.
So the time difference between receiving the black frame and sending it is the full latency, covering not just capture but the complete DirectShow processing and rendering.
I'm just looking for any explanation of this strange behavior that might come to mind.
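For reference, here is a minimal sketch of the measurement loop described above, in plain C++. The Direct3D Present() call and the DirectShow sample-grabber output are stood in for by hypothetical presentBlack/grabFrame hooks; the threshold and the dummy hooks in main() are illustrative only, not the actual tool:

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <iostream>
#include <vector>

using Clock = std::chrono::steady_clock;
using Frame = std::vector<uint8_t>;

// A captured frame counts as "black" if its mean pixel value is low.
static bool isBlack(const Frame& f, unsigned threshold = 16)
{
    if (f.empty()) return false;
    uint64_t sum = 0;
    for (uint8_t p : f) sum += p;
    return sum / f.size() < threshold;
}

// Full capture+render round trip: present a black frame, note the time,
// then poll captured frames until the black frame comes back.
static double measureRoundTripMs(const std::function<void()>& presentBlack,
                                 const std::function<Frame()>& grabFrame)
{
    presentBlack();
    const auto sent = Clock::now();
    while (!isBlack(grabFrame())) { /* keep polling captured frames */ }
    return std::chrono::duration<double, std::milli>(Clock::now() - sent).count();
}

int main()
{
    // Dummy hooks for illustration: a "camera" that sees the black
    // frame on the third grab, standing in for the real pipeline.
    int grabs = 0;
    auto present = [] {};
    auto grab = [&grabs]() -> Frame {
        return Frame(640 * 480, ++grabs >= 3 ? 0 : 128);
    };
    std::cout << "round trip: " << measureRoundTripMs(present, grab) << " ms\n";
}
```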

Posted: 02 February 2010 11:36 PM   [ # 3 ] by AlexP (Administrator)
igor1960 - 02 February 2010 11:10 PM (quoted above)

DirectShow introduces latency of its own, and it's not meant for real-time processing. On top of this, did you account for the latency of your display (monitor) and the latency of your graphics card? You cannot assume that this rendering/display latency is constant across the listed frame rates. How did you measure it?
A more realistic test of the camera latency would be to light a bar of LEDs (in hardware) and capture them with the camera. This way you eliminate the whole unknown rendering/graphics-card/display latency. Remember that Windows is not a real-time OS, far from it. What was the CPU usage at those high frame rates? Maybe you started dropping frames.

For any kind of high-performance capture with minimal latency you should be using the CL-Eye SDK. As opposed to DirectShow, the data path from the camera to your buffer in the CL-Eye SDK is well known and optimized, so you will get minimal latency using this API.
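For illustration, a minimal SDK capture loop might look like the sketch below. This is based on the CLEyeMulticam API as it appears in the SDK samples (CLEyeCreateCamera, CLEyeCameraGetFrame, etc.); check the exact names and signatures against the headers in your SDK version:

```cpp
#include <windows.h>
#include <cstdio>
#include <vector>
#include "CLEyeMulticam.h"  // CL-Eye SDK header, per the SDK samples

int main()
{
    if (CLEyeGetCameraCount() < 1) { printf("no camera found\n"); return 1; }

    // 640x480 grayscale at 60 fps; enums per the SDK samples.
    CLEyeCameraInstance cam = CLEyeCreateCamera(CLEyeGetCameraUUID(0),
                                                CLEYE_MONO_PROCESSED,
                                                CLEYE_VGA, 60.0f);
    if (!cam) return 1;

    int w = 0, h = 0;
    CLEyeCameraGetFrameDimensions(cam, w, h);
    std::vector<BYTE> frame(w * h);  // 1 byte per pixel in mono mode

    CLEyeCameraStart(cam);
    for (int i = 0; i < 100; ++i)
    {
        // Blocks until the next frame arrives; since the camera has no
        // on-board buffer, this is always the latest captured frame.
        if (CLEyeCameraGetFrame(cam, frame.data()))
        {
            // ... timestamp here and compare against your stimulus ...
        }
    }
    CLEyeCameraStop(cam);
    CLEyeDestroyCamera(cam);
    return 0;
}
```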

AlexP

Posted: 03 February 2010 01:37 AM   [ # 4 ] by igor1960 (Jr. Member)

Alex, I'm pretty experienced with DirectX/Direct3D, so I know its inefficiencies quite well.
The purpose of the DirectX latency test is precisely to exercise that path and to help you, since from what I understand that is your main target client.
As to whether I should avoid rendering: that is program-specific, since some programs require the full latency calculation including rendering. In fact, most programs beyond pure image processing do require rendering, and latency matters most in exactly those cases.

So let me be more specific: even with your proposed LED solution, without DirectX, did you measure, and can you publish, some latency figures? What I mean is: assuming the target has changed, does it take exactly one frame to fully reflect the change in the target image, or more?
I understand that theoretically it should be just one frame. But in practice there might be some buffering on both sides of the USB transport, so there may be more than one frame in the buffer, and a target change would then reach the host after more than one frame.
If there is buffering, maybe the buffer size is a function of the frame rate, which would explain why I'm getting higher latency at higher FPS.

This was my question.

Posted: 03 February 2010 06:43 PM   [ # 5 ] by AlexP (Administrator)
igor1960 - 03 February 2010 01:37 AM (quoted above)

The reason I'm asking about the purpose of your test is that, based on your setup, you cannot conclude anything with any confidence: you are measuring the total latency of the capture-display loop. That might be useful for real-time UI/game interaction via multitouch, for example, but it doesn't give you any hard numbers on the performance of each subsystem.
I would rather see isolated latency measurements of capture and display separately, and then their combined latency. That way we can pinpoint exactly which latency changes.

The main client for this camera is not DirectShow; it is in fact the API provided by the SDK. DirectShow is mostly meant for end users who are less interested in high performance than in the ability to use this great camera with various web chat/video capture programs.

The internal CL-Eye SDK API does not introduce any time- or fps-varying buffering mechanisms; it will always give you the last frame captured. Due to the nature of the camera design there is no internal buffer on board, and any captured data is transmitted immediately and passed to the user buffer upon complete receipt of the video frame. Precisely because of this, the camera can be very unstable under high CPU usage, since any lost data results in the camera image being corrupted or dropped.

From my tests (with the CL-Eye SDK), the time it takes for frame data to propagate through the image conversion and lens distortion algorithms is about 2 ms (tested on an i7 920 CPU, Vista Ultimate x64). So in the worst case your latency is the time to capture the image plus 2 ms. Of course, on a slower CPU this time will vary, and if the CPU cannot keep up you will see increased latency (something like what you've shown in your results). With the CPU busy, the other tasks involved (rendering, display, etc.) also take longer, so the timing will not increase linearly with the frame rate; it will be a more complex function of fps.
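To put rough numbers on that: at 75 fps the frame period is 1000/75 ≈ 13.3 ms, so the ideal worst case would be about 13.3 ms + 2 ms ≈ 15 ms, an order of magnitude below the ~116 ms measured in the first post.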

I would be really interested to see the results of the same test using the SDK for capture and your custom DX renderer for the display.

AlexP

Posted: 03 February 2010 08:06 PM   [ # 6 ] by igor1960 (Jr. Member)

Alex,

Thanks for answering: so there are no internal buffers whatsoever, even in the DirectShow capture source?

As to measuring the DirectShow delays separately: I can give you numbers.
I'm testing on a 75 Hz LCD monitor with a 200 Hz backlight, so I know the timing of the monitor's vertical sync precisely, and I'm measuring the rendering delay too. So I'll extend my previous data with figures that break out the rendering delays:

640*480 @ 15 fps:  latency ~130 ms, or around 2 frames;  rendering delay ~8 ms;  capture/transport delay ~122 ms
640*480 @ 30 fps:  latency ~66 ms, or around 2 frames;  rendering delay ~7 ms;  capture/transport delay ~59 ms
640*480 @ 40 fps:  latency ~52 ms, or around 3 frames;  rendering delay ~6.5 ms;  capture/transport delay ~45.5 ms   <== this is the best
640*480 @ 50 fps:  latency ~60 ms, or around 3 frames;  rendering delay ~10 ms;  capture/transport delay ~50 ms
640*480 @ 60 fps:  latency ~96 ms, or around 6 frames;  rendering delay ~26 ms;  capture/transport delay ~70 ms
640*480 @ 75 fps:  latency ~96 ms, or around 7 frames;  rendering delay ~32 ms;  capture/transport delay ~68 ms

As to testing the CL-Eye SDK without DirectShow: I might write a small test when I have time…

Posted: 03 February 2010 09:13 PM   [ # 7 ] by AlexP (Administrator)
igor1960 - 03 February 2010 08:06 PM (quoted above)
How exactly are you measuring the rendering delay? Do you take into account the video card's double/triple buffering latency? What I'm asking is precisely how much time it takes from your call to render the buffer until it actually gets displayed on the screen. Are you doing this in exclusive mode?

DirectShow capture does not use any internal buffering; the buffer from the camera is passed directly to the DS buffer. Of course, the time at which this gets called is another matter.

AlexP
Posted: 03 February 2010 10:29 PM   [ # 8 ] by igor1960 (Jr. Member)

Independent of full-screen or windowed mode, if you create the Direct3D device with PresentationInterval in the D3D presentation parameters set to D3DPRESENT_INTERVAL_ONE, Present will wait exactly for the next vertical sync.
Alternatively, D3DPRESENT_INTERVAL_IMMEDIATE will not wait for vertical sync at all (a minimal device-creation sketch is at the end of this post).
So what I'm doing is simply measuring the time spent in the DirectShow graph between setting the black buffer (and/or testing the next captured buffer) and the return from Present, with either D3DPRESENT_INTERVAL_ONE or D3DPRESENT_INTERVAL_IMMEDIATE.
Theoretically you might be right: there might be an extra hardware delay between Present and the actual appearance on the monitor. However, that behavior is not what the Direct3D documentation states.
Even if there is such a delay, I would assume it still would not explain the strange behavior I'm describing: if there is no buffering at all, why does capture take more time at higher FPS?
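For reference, a minimal sketch of the device creation described above (error handling omitted; hWnd is assumed to be an existing window handle):

```cpp
#include <d3d9.h>
#pragma comment(lib, "d3d9.lib")

// Creates a windowed D3D9 device whose Present() waits for the next
// vertical sync (D3DPRESENT_INTERVAL_ONE). Swapping in
// D3DPRESENT_INTERVAL_IMMEDIATE makes Present() return without waiting.
IDirect3DDevice9* CreateVSyncedDevice(IDirect3D9* d3d, HWND hWnd)
{
    D3DPRESENT_PARAMETERS pp = {};
    pp.Windowed             = TRUE;
    pp.SwapEffect           = D3DSWAPEFFECT_DISCARD;
    pp.BackBufferFormat     = D3DFMT_UNKNOWN;           // current display format
    pp.hDeviceWindow        = hWnd;
    pp.PresentationInterval = D3DPRESENT_INTERVAL_ONE;  // wait for vsync

    IDirect3DDevice9* dev = NULL;
    d3d->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, hWnd,
                      D3DCREATE_HARDWARE_VERTEXPROCESSING, &pp, &dev);
    return dev;  // NULL on failure
}
```

The IDirect3D9 interface itself would come from Direct3DCreate9(D3D_SDK_VERSION).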

Posted: 04 February 2010 01:44 AM   [ # 9 ] by AlexP (Administrator)
igor1960 - 03 February 2010 10:29 PM (quoted above)

Could you list the CPU usage along with your test results?

Posted: 04 February 2010 02:19 AM   [ # 10 ] by igor1960 (Jr. Member)

640*480 @ 15 fps:  6% CPU;  latency ~130 ms, or around 2 frames;  rendering delay ~8 ms;  capture/transport delay ~122 ms
640*480 @ 30 fps:  11% CPU;  latency ~66 ms, or around 2 frames;  rendering delay ~7 ms;  capture/transport delay ~59 ms
640*480 @ 40 fps:  16% CPU;  latency ~52 ms, or around 3 frames;  rendering delay ~6.5 ms;  capture/transport delay ~45.5 ms
640*480 @ 50 fps:  19% CPU;  latency ~60 ms, or around 3 frames;  rendering delay ~10 ms;  capture/transport delay ~50 ms
640*480 @ 60 fps:  30% CPU;  latency ~96 ms, or around 6 frames;  rendering delay ~26 ms;  capture/transport delay ~70 ms
640*480 @ 75 fps:  50% CPU;  latency ~96 ms, or around 7 frames;  rendering delay ~32 ms;  capture/transport delay ~68 ms

There is a surprisingly huge jump in CPU usage from 60 fps to 75 fps, but I've already reported this in one of the previous threads about the CL SDK…
The jumps in CPU usage at 60 and 75 fps could also be partly self-inflicted: as the latency grows to 6-7 frames, my algorithm itself spends more time checking for the black frame's arrival…
