Jun 30, 2009

projection in vertex shader

Ordinary perspective projection:

Projection tweaked in the vertex shader:

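I don't know why the image below looks a bit distorted, but it seems the vertex shader can make the z value linear. If so, that should avoid z-fighting. I'm not sure how much performance it costs.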

deferred lighting

In the beginning, I wanted to know how "light pre-pass" works, but I couldn't grasp the idea easily, so I started studying deferred lighting instead. Finally, I made a sample that burns all the GPU power I have (yes, I implemented it without any optimization!). Here is a screenshot: 128 point lights rotating around each ball. It looks like wave reflections could be implemented the same way.

Jun 29, 2009




DXGI_FORMAT_R8G8B8A8_UNORM is the result I actually wanted! By default, the DX10 DXUT only picks DXGI_FORMAT_R8G8B8A8_UNORM_SRGB as the back buffer format!
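My guess at why the two formats look different: with an _SRGB render target, the hardware encodes the linear shader output to sRGB when it writes, so colors that are already gamma-encoded get brightened a second time. The sketch below is just the standard sRGB encoding curve written out on the CPU. The function name is mine, not a DXGI or DXUT call.

```cpp
#include <cmath>

// Standard linear -> sRGB encoding for one channel in [0, 1].
// This is the conversion an _SRGB render target applies on write.
float linearToSrgb(float c) {
    return c <= 0.0031308f
        ? 12.92f * c
        : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}
```

A mid-gray of 0.5 comes out at roughly 0.735, which matches the washed-out look you get when the input was not linear to begin with.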

Jun 22, 2009

when to SSE ?

A long time ago, when I was playing with Jaina, I didn't know why my Jaina couldn't move as smoothly as Blizzard's. But there was a rumor about how to make Jaina move more like a real young girl, and SSE was the first solution.

I haven't found any documentation on the cost of SSE (ohh... I'm lazy, you know that...). But in my experience there is a simple rule. The most common use of SSE is matrix multiplication, and there are many, many multiplications in skinned bone animation. But you'll find it costs more CPU time if you only use SSE to multiply "one" vector by "one" matrix.

The simple rule is:
if the number of vector multiplications is more than double the number of vector I/O operations, SSE will give better performance.

For example, the dot product of 2 vectors needs 3 I/O operations (2 vectors in, one value out), but only 1 vector multiplication. Another example is a vector times a matrix: 6 I/O operations (5 in, 1 out) with 4 vector multiplications... so we still can't get better performance.

But in the bone animation case there are few bones and many, many points. That means many points are multiplied by the same matrix in each frame. If you have to multiply N points, you need:

  1. 4 vector reads for the matrix.
  2. N vector reads for the points.
  3. N vector writes for the points.
  4. 4 vector multiplications per point.

Ignoring the first item, that is 4N multiplications against 2N I/O operations, right at the edge of my simple rule. So there is a chance to make my Jaina move more smoothly (just write a function that multiplies many vectors by one matrix).
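The "many vectors, one matrix" function could look roughly like this. It is a sketch, not the code from my Jaina viewer: Vec4, Mat4 and transformPoints are names made up here, and I assume the D3D row-vector convention (out = v * M) and 16-byte aligned data so the aligned load/store intrinsics are legal.

```cpp
#include <xmmintrin.h>  // SSE intrinsics
#include <cstddef>

struct alignas(16) Vec4 { float x, y, z, w; };
struct alignas(16) Mat4 { float m[4][4]; };  // 4 rows, row-vector convention

// Multiply `count` points by the same matrix: out[i] = in[i] * mat.
void transformPoints(const Mat4& mat, const Vec4* in, Vec4* out, std::size_t count) {
    // The 4 matrix rows are read once and reused for every point.
    const __m128 r0 = _mm_load_ps(mat.m[0]);
    const __m128 r1 = _mm_load_ps(mat.m[1]);
    const __m128 r2 = _mm_load_ps(mat.m[2]);
    const __m128 r3 = _mm_load_ps(mat.m[3]);

    for (std::size_t i = 0; i < count; ++i) {
        const __m128 v = _mm_load_ps(&in[i].x);  // 1 vector read per point

        // Broadcast each component across a whole register.
        const __m128 xxxx = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));
        const __m128 yyyy = _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 1, 1, 1));
        const __m128 zzzz = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 2, 2, 2));
        const __m128 wwww = _mm_shuffle_ps(v, v, _MM_SHUFFLE(3, 3, 3, 3));

        // 4 vector multiplications per point, summed row by row.
        __m128 r = _mm_mul_ps(xxxx, r0);
        r = _mm_add_ps(r, _mm_mul_ps(yyyy, r1));
        r = _mm_add_ps(r, _mm_mul_ps(zzzz, r2));
        r = _mm_add_ps(r, _mm_mul_ps(wwww, r3));

        _mm_store_ps(&out[i].x, r);  // 1 vector write per point
    }
}
```

The per-point cost is exactly the 1 read, 1 write and 4 multiplications counted above. Whether it really beats plain scalar code is something to measure, not assume.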

BTW, there are many zeros in a typical 3D matrix... but that's another story.

Jun 21, 2009



Jun 20, 2009


The shed in the middle has LOD... even at such a close distance. Is it a bit dull to keep noticing this kind of thing while playing a game? XD









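In the screenshot above, I can't get through the gap between a brazier and a flag, because the bounding box gets caught on the brazier and I can't move forward at all.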

Jun 19, 2009



Nanami Shiono, "The Story of the Romans IX" (塩野七生.《羅馬人的故事 IX》)

Jun 17, 2009


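I've been playing Warhammer lately, but I probably won't keep at it much longer. There are lots of bugs, and it doesn't run smoothly on my G1S. If 3 GB of RAM plus an 8600GT can't run it smoothly, something must be wrong with this game. The setting is fairly interesting, but from day one I've felt that, in both art and programming, it can't compete with Blizzard. I'm tired today, so I'll just mention one odd rendering bug and leave the rest for when I have time.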

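The picture isn't very clear, but there is a "box" inside the red frame in the upper left. Oh no, that's a misunderstanding: it is actually a lamp, and the surrounding scene is full of smoke (or blowing sand?). Such an abrupt box shows up because of a z check / z write problem. The lamp looks like a billboard with a texture (texture animation? maybe), so it is a translucent object, and the whole sheet of smoke is translucent too. Both are drawn only after most of the scene has been rendered, because translucent rendering gives wrong results if it isn't sorted. My guess at how the problem happens: the draw order of the lamp glow and the smoke isn't fixed, sometimes the lamp goes first and sometimes the smoke, so the problem isn't always there. On top of that, z write is on when the lamp glow is drawn, and z check is on when the smoke is drawn. If the lamp is drawn first, its z write (and the z values may even be wrong) makes the smoke fail the z check in that area, so the smoke can't be drawn there, and the result is a box.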
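To check that my guess at least hangs together, here is a tiny simulation of a single pixel. Everything in it (Pixel, Translucent, the depth values) is made up for illustration and has nothing to do with the game's actual renderer. It counts how many translucent layers end up blended into the pixel for a given draw order.

```cpp
#include <algorithm>
#include <vector>

struct Pixel { float depth = 1.0f; int blendedLayers = 0; };
struct Translucent { float depth; bool zWrite; };  // smaller depth = closer

// Draw one translucent layer with z check always on (LESS comparison).
void drawOne(Pixel& px, const Translucent& t) {
    if (t.depth >= px.depth) return;   // z check fails, nothing is drawn
    ++px.blendedLayers;                // layer is blended into the pixel
    if (t.zWrite) px.depth = t.depth;  // z write blocks anything farther away
}

// Unsorted rendering: whatever order the items happen to arrive in.
int drawInGivenOrder(const std::vector<Translucent>& items) {
    Pixel px;
    for (const Translucent& t : items) drawOne(px, t);
    return px.blendedLayers;
}

// The usual fix: sort back to front and turn z write off for translucents.
int drawSortedNoZWrite(std::vector<Translucent> items) {
    std::sort(items.begin(), items.end(),
              [](const Translucent& a, const Translucent& b) { return a.depth > b.depth; });
    Pixel px;
    for (Translucent t : items) {
        t.zWrite = false;
        drawOne(px, t);
    }
    return px.blendedLayers;
}
```

With a lamp at depth 0.3 (z write on) and smoke at depth 0.6, drawing the lamp first leaves only one layer in the pixel, which is the box. Drawing the smoke first, or using the sorted version, blends both.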


why UnAdvise ?

If you are familiar with COM, you have probably worked with AdviseFooEventSink / UnAdviseFooEventSink. You have to call UnAdviseFooEventSink explicitly somewhere... and preferably not from within the event sink itself, especially not in the destructor of the event object. But WHY?

I have been trying to demo something with DirectShow recently. While working with a filter today, I made a mistake by accident: I did something like UnAdviseFooEventSink in the destructor of the event object. Let's check what happens:

  1. When the event sink object is created, its reference count is 1.
  2. When Foo advises the sink object, the sink's reference count increases to 2.
  3. When I don't need the sink object anymore and release it, the reference count decreases to 1.
  4. Now Foo is the only owner of the sink object, and I expect the sink to be unadvised when it is deleted.
  5. But since Foo still owns the sink object, the sink is never deleted unless Foo does extra work, so its destructor (and the UnAdvise inside it) never runs.

Thanks to the debugging facilities in the DirectShow base classes, I got some ugly asserts and set about fixing it.
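The reference-count steps above can be replayed with a stripped-down sketch. This is not real COM and not the DirectShow code: Sink, Source, Advise and Unadvise are hypothetical stand-ins with hand-rolled reference counting, just enough to show that the sink's destructor cannot run while the source still holds its reference.

```cpp
// Hypothetical event sink with manual reference counting.
class Sink {
public:
    explicit Sink(bool* destroyedFlag) : destroyed_(destroyedFlag) {}
    void AddRef() { ++refs_; }
    void Release() { if (--refs_ == 0) delete this; }
private:
    // An Unadvise call placed here would never run while the source
    // still holds a reference, because the count never reaches 0.
    ~Sink() { *destroyed_ = true; }
    int refs_ = 1;     // step 1: created with a count of 1
    bool* destroyed_;  // lets the caller observe destruction
};

// Hypothetical event source ("Foo") that holds a reference to its sink.
class Source {
public:
    void Advise(Sink* s) { s->AddRef(); sink_ = s; }  // step 2: count becomes 2
    void Unadvise() {
        if (sink_) { sink_->Release(); sink_ = nullptr; }
    }
private:
    Sink* sink_ = nullptr;
};
```

After Advise and my own Release, the count sits at 1 and the destroyed flag stays false. Only an explicit Unadvise from outside drops the last reference and lets the destructor run.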

BTW, the same issue can happen in Cocoa on the Mac, too. If 2 objects own each other (retain each other), they should be disconnected before the final outside reference is released.

Jun 1, 2009

some methods to optimize gui rendering.

I thought about GUI rendering today and got some ideas. Here is a brief note (though it may be useless, since UI rendering is an old topic).

Sometimes we have to render UI with a 3D API such as D3D. For example, on Windows it's hard to render UI over video without D3D. The generic way to render a GUI is to draw many "rectangles": buttons, checkboxes, comboboxes, etc. Each rectangle needs one draw call. When there are too many controls (including text), draw primitive gets called too many times and starts to affect performance. Here are some things we can try:

[1]. Sort all GUI rendering by texture. The first thing that forces us to render rectangles one by one is that the images may not all be on the same texture. Controls that share a texture can be drawn together as one "triangle list", which reduces the number of draw calls.

[2]. Separate the GUI into dynamic and static parts. All controls share the same "dynamic" vertex buffer. Dynamic controls are moved by changing their vertex positions, while static controls have fixed positions. So the vertices in the buffer can also be sorted by state into 2 parts, which gives a chance to minimize the vertex updates.

[3]. All controls share the same "dynamic" index buffer. Hidden controls can simply be skipped by leaving their indices out. Sorting the indices could improve performance, too.

[4]. If there are alpha-blended controls, separate them into another pass... or everything is ruined.

[5]. So now you can:
  1. get a single texture with the images of all controls.
  2. update the positions of dynamic controls by updating the vertex buffer.
  3. update the visibility of controls by updating the index buffer.
  4. collect all characters in the same texture (or in fewer textures).
  5. pray.
[6]. I am kidding... please let me know if you try it.
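Since I haven't tried it myself, here is only a CPU-side sketch of points [1] and [3]: sort the quads by texture, build one triangle list per texture, and leave hidden controls out of the index list. Quad, Vertex, Batch and buildBatches are hypothetical names, and nothing here touches D3D. Each resulting batch would map to one draw call. The alpha pass from [4] and the dynamic/static split from [2] are not covered.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Quad { int textureId; float x, y, w, h; bool visible; };
struct Vertex { float x, y; };
struct Batch {
    int textureId;
    std::vector<Vertex> vertices;        // 4 per control, hidden ones included
    std::vector<std::uint16_t> indices;  // 6 per visible control
};

// Group controls by texture so each texture needs a single draw call.
std::vector<Batch> buildBatches(std::vector<Quad> quads) {
    std::stable_sort(quads.begin(), quads.end(),
                     [](const Quad& a, const Quad& b) { return a.textureId < b.textureId; });

    std::vector<Batch> batches;
    for (const Quad& q : quads) {
        if (batches.empty() || batches.back().textureId != q.textureId)
            batches.push_back(Batch{q.textureId, {}, {}});
        Batch& b = batches.back();

        const int base = static_cast<int>(b.vertices.size());
        b.vertices.push_back({q.x, q.y});
        b.vertices.push_back({q.x + q.w, q.y});
        b.vertices.push_back({q.x, q.y + q.h});
        b.vertices.push_back({q.x + q.w, q.y + q.h});

        // Hidden controls keep their vertices but get no indices.
        if (!q.visible) continue;
        for (int offset : {0, 1, 2, 2, 1, 3})
            b.indices.push_back(static_cast<std::uint16_t>(base + offset));
    }
    return batches;
}
```

Three controls on two textures, one of them hidden, collapse into two batches, so two draw calls.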