


Subject: Tile Renderers
Posted by: finaldave
Posted on: 07/26/04 10:21 AM



I'm working on Giga/PicoDrive, a Megadrive emulator, a bit at the moment. When I did a few measurements a few weeks ago I found 4ms was used on cpu, and 20ms used on graphics rendering on a GP32.

So I thought I might take a bit of time to jump up and down on the tile rendering and make it a bit more efficient. It's a line-by-line renderer at the moment, which I'm keen to hold onto for as long as possible because it makes raster effects so easy. But I may have to lose that at some point, or at least make two graphics engines.

I'm curious to see how fast a line-by-line tile renderer can go, though.

The Megadrive has 2 scroll planes, A and B, with 4-bit 8x8 tiles drawn from 4 palettes. Colour 0 is transparent. Each tile can be set to high or low priority by a bit, and can be flipped in x and y. All stuff I'm sure you know, but just to remind you ;-)


The first optimisation I have is the blank tile one - it remembers the 16-bit tile code which is used for a blank tile, and doesn't bother drawing it at all.
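
A minimal sketch of the blank-tile check described above. The names (`BlankCache`, `skip_if_blank`) are made up for illustration, and it assumes the usual Megadrive layout where the low 11 bits of the 16-bit tile code select a 32-byte pattern in VRAM. It is not the actual PicoDrive code.

```c
#include <stdint.h>

/* Hypothetical per-plane cache: the last name-table code found to be blank. */
typedef struct {
    int      valid;       /* 0 until a blank code has been seen */
    uint16_t blank_code;  /* 16-bit tile code known to draw nothing */
} BlankCache;

/* Returns 1 if all 32 bytes (8 lines x 4 bytes) of the pattern are zero. */
static int tile_is_blank(const uint8_t *vram, unsigned tile_index)
{
    const uint8_t *p = vram + tile_index * 32u;
    unsigned i;
    for (i = 0; i < 32u; i++)
        if (p[i] != 0) return 0;
    return 1;
}

/* Returns 1 if the tile can be skipped, 0 if the caller should draw it. */
static int skip_if_blank(BlankCache *bc, const uint8_t *vram, uint16_t code)
{
    /* Cheap path: a single compare against the remembered blank code. */
    if (bc->valid && code == bc->blank_code) return 1;

    /* Slow path: inspect the pattern (low 11 bits = tile index, assumed). */
    if (tile_is_blank(vram, code & 0x7ffu)) {
        bc->blank_code = code;
        bc->valid = 1;
        return 1;
    }
    return 0;
}
```

The cache would have to be cleared whenever the remembered pattern is written to in VRAM. A real renderer would probably learn the blank code from the draw routine's return value rather than scanning the pattern separately.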

Reference piccy from Wiki:
Just looking at a typical screenshot from Sonic 1, you could save a lot of time if you could avoid plotting the background tiles which are hidden. (But is it worth the overhead of calculating which are invisible?)
Also, could you plot the repeating tiles by just copying over the original data? etc.


My question is, there must be a lot of ideas and things people have tried in the past since there are so many emulators with tile rendering - what optimisations do you think worked well?


You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: Bart T.
Posted on: 07/27/04 00:38 AM



>But I may have to lose that at some point or at least make two
> graphics engines.

Are you sure a tile engine would be noticeably faster? It seems like you'd be giving up accuracy, to save only a very slight per-line performance hit, and getting a worse user experience (having to choose between 2 engines, unless you intend to make it switch automatically and transparently) and less accurate emulation with a barely noticeable speed improvement.

In the end, you're still rendering the same amount of pixels. Can you identify if there's a bottleneck in line rendering?

I tried implementing a tile renderer for Genital once and it turned out to be remarkably similar to the line engine, only it couldn't handle line scrolling.

> The first optimisation I have is the blank tile one - it remembers the 16-bit
> tile code which is used for a blank tile, and doesn't bother drawing it at all.

This could be a good one. Try implementing it and see how much of a boost it gives. Perhaps you could even do something awesome like self-modifying code, assuming the target devices don't have any cache (I don't know much about ARM.)

> Reference piccy from Wiki:
> Just looking at a typical screenshot from Sonic1, you could save a lot of time
> if you could avoid plotting the background tiles which were hidden. (But is it
> worth the overhead of calculating which are invisble?)

The overhead of figuring out which tile is invisible would be minimal. It would be done only during VRAM writes.
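
One way this could look, as a rough sketch: keep a flag per 4-byte tile line that says whether all 8 pixels are non-zero, and refresh it from the VRAM write path. The hook name `vram_write8` and the flag table are hypothetical. The sketch also ignores the fact that the two planes scroll independently, so in practice a background tile is usually covered by parts of two foreground tiles.

```c
#include <stdint.h>

#define VRAM_SIZE 0x10000u

static uint8_t vram[VRAM_SIZE];
/* One flag per 4-byte tile line: 1 = all 8 pixels are non-zero (opaque). */
static uint8_t line_opaque[VRAM_SIZE / 4];

/* Recompute the opaque flag for the tile line that contains addr. */
static void update_line_flag(uint32_t addr)
{
    uint32_t base = (addr & (VRAM_SIZE - 1)) & ~3u;
    int opaque = 1;
    unsigned i;
    for (i = 0; i < 4; i++) {
        uint8_t b = vram[base + i];
        /* Each byte holds two pixels; either nibble being 0 is a hole. */
        if ((b & 0x0f) == 0 || (b & 0xf0) == 0) opaque = 0;
    }
    line_opaque[base >> 2] = (uint8_t)opaque;
}

/* Hypothetical VRAM write hook: store the byte, then refresh the flag. */
static void vram_write8(uint32_t addr, uint8_t value)
{
    addr &= VRAM_SIZE - 1;
    vram[addr] = value;
    update_line_flag(addr);
}

/* A background line is hidden if the aligned line in front of it is opaque. */
static int back_line_hidden(uint32_t front_line_addr)
{
    return line_opaque[(front_line_addr & (VRAM_SIZE - 1)) >> 2];
}
```

The renderer then pays one table lookup per tile line, and the per-pixel work happens only when a game actually writes to VRAM.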

> My question is, there must be a lot of ideas and things people have tried in the
> past since there are so many emulators with tile rendering - what optimisations
> do you think worked well?

Charles MacDonald has a really accurate renderer in Genesis Plus, IIRC. Several years ago he sent me a rough draft of a document which described how Genecyst and Genesis Plus (then MekaDrive) rendered front-to-back quickly (which is required for accuracy.) I wish I had it still (I may have a printed copy somewhere in my closet.) If you're not already doing accurate front-to-back rendering, it would have been an interesting read. Genecyst encoded priority information into the unused bits of each 8-bit pixel (since only 6 bits were needed for color.) Consequently, this is why shadow/highlight mode was never implemented. MekaDrive used a similar technique but supported shadow/highlight accurately. I think it may have had 2 separate engines it switched between at run-time.

If your target processor speeds are really low, you have no cache, and have enough memory, look-up tables may be beneficial. There are three kinds of tiles: Fully transparent (probably only 1), partially transparent, and fully opaque. If a sizable portion of the screen is covered by fully opaque tiles, it could be worthwhile to write code that detects them and then plots them without doing a pixel-for-pixel transparency check. Maybe you could even figure this data out for each tile line rather than just the entire tile.
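
To illustrate the per-line version of that idea, here is a hedged sketch that classifies one packed tile line and uses a check-free loop for fully opaque lines. It assumes the 8 pixels are packed into a 32-bit value with the leftmost pixel in the top nibble; the function names are invented for the example.

```c
#include <stdint.h>

enum { LINE_TRANSPARENT, LINE_PARTIAL, LINE_OPAQUE };

/* Classify 8 packed 4-bit pixels without looping over them. */
static int classify_line(uint32_t pack)
{
    uint32_t nz;
    if (pack == 0) return LINE_TRANSPARENT;
    /* Fold each nibble's four bits into its lowest bit (set if nibble != 0). */
    nz = pack | (pack >> 1);
    nz |= nz >> 2;
    nz &= 0x11111111u;
    return (nz == 0x11111111u) ? LINE_OPAQUE : LINE_PARTIAL;
}

/* Plot one tile line into pd[0..7]; pal holds the palette bits to OR in. */
static void plot_line(uint8_t *pd, uint32_t pack, uint8_t pal)
{
    int x;
    switch (classify_line(pack)) {
    case LINE_TRANSPARENT:
        return;                          /* nothing to draw */
    case LINE_OPAQUE:
        /* No transparency test needed: every pixel is written. */
        for (x = 7; x >= 0; x--, pack >>= 4)
            pd[x] = (uint8_t)(pal | (pack & 0xfu));
        return;
    default:
        /* Mixed line: colour 0 must be skipped pixel by pixel. */
        for (x = 7; x >= 0; x--, pack >>= 4) {
            uint32_t t = pack & 0xfu;
            if (t) pd[x] = (uint8_t)(pal | t);
        }
    }
}
```

In a real renderer the classification would more likely be cached per tile line (for example in a look-up table) than recomputed on every plot.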

On low-speed processors, hand tweaked assembly is the way to go :) At least for rasterizing.


----
Bart


Subject: Re: Tile Renderers
Posted by: Jan_Klaassen
Posted on: 07/27/04 09:58 AM



> I'm working on Giga/PicoDrive, a Megadrive emulator, a bit at the moment. When I
> did a few measurements a few weeks ago I found 4ms was used on cpu, and 20ms
> used on graphics rendering on a GP32.
>
> So I thought I may take a bit of time to jump up and down on the tile rendering
> and make it a bit more efficent. It's a line by line renderer at the moment,
> which I'm keen to hold onto for at long as possible because it makes raster
> effects so easy. But I may have to lose that at some point or at least make two
> graphics engines.

Why? You could simply use a normal tile engine and add some stuff to make clipping efficient, e.g. only parse the needed parts of the tilemap.





Subject: Re: Tile Renderers
Posted by: Hyde
Posted on: 07/27/04 04:04 PM



> Megadrive has 2 scroll planes A and B, with 4-bit 8x8 tiles from 4 palettes.
> Colour 0 is transparent. A tile can be selected high or low priority by a bit
> and can be flips x and y. All stuff I'm sure you know, but just to remind you
> ;-)

Is the character (pattern) data packed like graphics in the NES (i.e. one byte holds the lsbits of a tileline and other bytes hold other bits)? If so, you might want to consider unpacking the tile data into something more accessible (I got a 10% speed boost in my NES emulator by doing so). I managed to do this while still using the same amount of memory.
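
For readers unfamiliar with the NES format mentioned here: each 8x8 tile is 16 bytes, with the 8 bytes of bit plane 0 followed by the 8 bytes of bit plane 1. The sketch below shows one possible way to unpack that into 2-bit "chunky" pixels in the same 16 bytes. It is only an illustration of the idea and not necessarily the layout Hyde used.

```c
#include <stdint.h>

/* Merge one row's two bit planes into 8 chunky 2-bit pixels.
   The leftmost pixel ends up in the top two bits of the result. */
static uint16_t nes_row_to_chunky(uint8_t plane0, uint8_t plane1)
{
    uint16_t out = 0;
    int x;
    for (x = 7; x >= 0; x--) {           /* bit 7 is the leftmost pixel */
        unsigned lo = (plane0 >> x) & 1u;
        unsigned hi = (plane1 >> x) & 1u;
        out = (uint16_t)((out << 2) | (hi << 1) | lo);
    }
    return out;
}

/* Convert a whole 16-byte planar tile into 8 rows of 16 bits (still 16 bytes). */
static void nes_tile_to_chunky(const uint8_t tile[16], uint16_t rows[8])
{
    int y;
    for (y = 0; y < 8; y++)
        rows[y] = nes_row_to_chunky(tile[y], tile[y + 8]);
}
```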




Subject: Re: Tile Renderers
Posted by: finaldave
Posted on: 07/27/04 04:22 PM



> > I'm working on Giga/PicoDrive, a Megadrive emulator, a bit at the moment. When I
> > did a few measurements a few weeks ago I found 4ms was used on cpu, and 20ms
> > used on graphics rendering on a GP32.
> >
> > So I thought I may take a bit of time to jump up and down on the tile rendering
> > and make it a bit more efficent. It's a line by line renderer at the moment,
> > which I'm keen to hold onto for at long as possible because it makes raster
> > effects so easy. But I may have to lose that at some point or at least make two
> > graphics engines.
>
> Why? You could simply use a normal tile engine and add some stuff to make
> clipping efficient, e.g. only parse the needed parts of the tilemap.
>

The Megadrive has per-line offsets... I'm not sure how this affects the choice between a line-based renderer and a tile-based renderer... I remember having a bit of a nightmare in Final Burn with a tile-based one!

I'm interested by Bart's comment that a line-based renderer may not be too much slower than a tile-based one after all, and I think that front-to-back rendering and other optimisations may well be easier on a line-by-line basis.

Has anyone actually compared a line-based and a tile-based engine and measured the performance difference?

If it is minimal, is there any great advantage to tile based engines at all??


You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: finaldave
Posted on: 07/27/04 04:27 PM



> > Megadrive has 2 scroll planes A and B, with 4-bit 8x8 tiles from 4 palettes.
> > Colour 0 is transparent. A tile can be selected high or low priority by a bit
> > and can be flips x and y. All stuff I'm sure you know, but just to remind you
> > ;-)
>
> Is the character (pattern) data packed like graphics in the NES (i.e. one byte
> holds the lsbits of a tileline and other bytes hold other bits)?
4 bytes hold 8 pixels, like this:

// Draw one 8-pixel tile line into pd; returns 1 if the line was blank
static int TileNorm(unsigned char *pd,int addr,int pal)
{
  unsigned int pack=0; int t=0;

  pack=*(unsigned int *)(Pico.vram+addr); // Get 8 pixels
  if (pack==0) return 1; // All 8 pixels transparent, nothing to draw

  // Peel off one 4-bit pixel at a time; colour 0 is transparent and skipped
  t=pack&0xf; pack>>=4; if (t) pd[3]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[2]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[1]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[0]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[7]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[6]=(unsigned char)(pal|t);
  t=pack&0xf; pack>>=4; if (t) pd[5]=(unsigned char)(pal|t);
  t=pack             ; if (t) pd[4]=(unsigned char)(pal|t); // Only the top nibble is left

  return 0;
}


Note the funny order is just because ARM is little endian; the nibbles are really in order on the Megadrive.
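
To make the nibble order easier to follow, here is a sketch that does the same job one 16-bit word at a time, so the result does not depend on reading a 32-bit value on a little-endian host. It assumes VRAM is kept as 16-bit words in host byte order, which is what the pd[3], pd[2], pd[1], pd[0] order above suggests but is not stated in the thread. The function name and the `hflip` parameter are additions for illustration.

```c
#include <stdint.h>

/* Draw one tile line from two 16-bit VRAM words into pd[0..7].
   Returns 1 if the line was blank, 0 otherwise (same convention as TileNorm). */
static int tile_line_from_words(uint8_t *pd, const uint16_t *words,
                                uint8_t pal, int hflip)
{
    uint16_t w0 = words[0];  /* pixels 0..3, pixel 0 in bits 15..12 */
    uint16_t w1 = words[1];  /* pixels 4..7, pixel 4 in bits 15..12 */
    int i;

    if ((w0 | w1) == 0) return 1;        /* all 8 pixels transparent */

    for (i = 0; i < 4; i++) {
        unsigned shift = 12u - 4u * (unsigned)i;
        unsigned t0 = (w0 >> shift) & 0xfu;
        unsigned t1 = (w1 >> shift) & 0xfu;
        int x0 = hflip ? 7 - i : i;      /* destination of pixel i     */
        int x1 = hflip ? 3 - i : i + 4;  /* destination of pixel i + 4 */
        if (t0) pd[x0] = (uint8_t)(pal | t0);
        if (t1) pd[x1] = (uint8_t)(pal | t1);
    }
    return 0;
}
```

The unrolled 32-bit version is likely to be faster on the GP32; this form is only meant to show where each pixel comes from.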


> If so, you
> might want to consider unpacking the tile data into something more accessible (I
> got a 10% speed boost in my NES emulator by doing so). I managed to do this
> while still using the same amount of memory.
>

I'm not sure really - one 32-bit value does seem quite nice for throwing around.


You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: erikduijs
Posted on: 07/28/04 09:46 AM



I was thinking if you do line-by-line front-to-back rendering, you could quite easily avoid drawing the invisible background.
Just a thought...




Subject: Re: Tile Renderers
Posted by: Jan_Klaassen
Posted on: 07/28/04 11:00 AM



> > Why? You could simply use a normal tile engine and add some stuff to make
> > clipping efficient, e.g. only parse the needed parts of the tilemap.
>
> The Megadrive has per line offset... I'm not sure how this effects the choice of
> a line-based renderer and a tile-based renderer... I remember have a bit of a
> nightmare in Final Burn with a tile based one!

It shouldn't, it's not hard to add to either.

> Had anyone actually compared a line-based and tile-based engine and the
> performance difference?
>
> If it is minimal, is there any great advantage to tile based engines at all??

Yes. For some systems it's handy to queue tiles based on priority before rendering them (e.g. systems with per-tile priorities). If you have a system that has that, plus masking effects when sprite-sprite priority doesn't match sprite-tile priority, it lets you optimise things more, since you already know which tile priorities you can skip. Of course the sprite bit implies you're rendering everything back-to-front to one buffer and using a z-buffer to get the masking effects. This is what makes the Cave games run full-speed on a PII-266 in FBA (except Guwange, which has a lot of out-of-sequence sprites). Note that FBA's Cave tile renderer doesn't use queues for row-scroll/row-select modes, since those aren't used in-game, and my laziness means I didn't optimise the rendering more than needed.
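
A very rough sketch of the queueing idea, assuming a small fixed number of priority levels. All names and sizes here are hypothetical and this is not FBA's actual code. Tiles are bucketed by priority while the tilemap is parsed, then drawn back-to-front, starting at whichever priority is known to be the lowest one still visible.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PRIO  4    /* hypothetical number of priority levels */
#define MAX_TILES 64   /* hypothetical capacity per level */

typedef struct { uint16_t code; int16_t x, y; } QueuedTile;

typedef struct {
    QueuedTile items[MAX_PRIO][MAX_TILES];
    int        count[MAX_PRIO];
} TileQueues;

static void queues_reset(TileQueues *q)
{
    int p;
    for (p = 0; p < MAX_PRIO; p++) q->count[p] = 0;
}

/* Returns 1 if queued, 0 if the priority is invalid or the bucket is full. */
static int queue_tile(TileQueues *q, unsigned prio, uint16_t code, int x, int y)
{
    QueuedTile *t;
    if (prio >= MAX_PRIO || q->count[prio] >= MAX_TILES) return 0;
    t = &q->items[prio][q->count[prio]++];
    t->code = code;
    t->x = (int16_t)x;
    t->y = (int16_t)y;
    return 1;
}

/* Draw buckets from first_prio upwards (back to front), skipping lower ones.
   Returns the number of tiles visited; draw may be NULL for a dry run. */
static int flush_queues(const TileQueues *q, unsigned first_prio,
                        void (*draw)(const QueuedTile *))
{
    unsigned p;
    int i, n = 0;
    for (p = first_prio; p < MAX_PRIO; p++) {
        for (i = 0; i < q->count[p]; i++, n++)
            if (draw) draw(&q->items[p][i]);
    }
    return n;
}
```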




Subject: Re: Tile Renderers
Posted by: finaldave
Posted on: 07/28/04 05:24 PM



> I was thinking if you do line-by-line front-to-back rendering, you could quite
> easily avoid drawing the invisible background.
> Just a thought...
>

Yes, I think that was the main thing I was thinking as well. I just didn't realise it had a name, "front-to-back" ;-)

So basically, for each line:

- calculate the tile values for Plane A
- Render Plane A high tiles
- calculate the tile values for Plane B which are still visible
- Render Plane B high tiles clipping to uncovered pixels
- Render Plane A low tiles clipping to uncovered pixels
- Render Plane B low tiles clipping to uncovered pixels

Making sure to skip completely blank tiles

Do you reckon that's pretty much as fast as it gets?
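
The per-line steps listed above can be sketched roughly as follows. This is a simplified illustration only: it assumes no scrolling and no sprites, one tile column per 8 pixels, a line buffer where 0 means "still uncovered", and tile lines packed with the leftmost pixel in the top nibble. The `PlaneLine` structure and function names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

#define LINE_W 320
#define COLS   (LINE_W / 8)

/* Hypothetical pre-fetched data for one plane on one scanline. */
typedef struct {
    uint32_t pack[COLS];  /* 8 packed pixels per column, leftmost in top nibble */
    uint8_t  pal[COLS];   /* palette bits, e.g. 0x00, 0x10, 0x20, 0x30 */
    uint8_t  high[COLS];  /* 1 = high priority tile */
} PlaneLine;

/* Write only the pixels that nothing in front has claimed yet. */
static void plot_uncovered(uint8_t *pd, uint32_t pack, uint8_t pal)
{
    int x;
    for (x = 7; x >= 0; x--, pack >>= 4) {
        uint32_t t = pack & 0xfu;
        if (t && pd[x] == 0) pd[x] = (uint8_t)(pal | t);
    }
}

/* One pass over a plane, drawing only tiles of the wanted priority. */
static void draw_pass(uint8_t *line, const PlaneLine *p, int want_high)
{
    int col;
    for (col = 0; col < COLS; col++) {
        if (p->high[col] != want_high) continue;
        if (p->pack[col] == 0) continue;          /* skip blank tile lines */
        plot_uncovered(line + col * 8, p->pack[col], p->pal[col]);
    }
}

/* Front-to-back order from the list above: A high, B high, A low, B low. */
static void render_line_front_to_back(uint8_t *line,
                                      const PlaneLine *a, const PlaneLine *b)
{
    memset(line, 0, LINE_W);
    draw_pass(line, a, 1);
    draw_pass(line, b, 1);
    draw_pass(line, a, 0);
    draw_pass(line, b, 0);
}
```

Note that every write now needs a read of the line buffer first, so whether this beats plain back-to-front drawing depends on how much overdraw it actually avoids.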







You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: Bart T.
Posted on: 07/28/04 09:10 PM



> Yes i think that was the main thing i was thinking as well, I just didn't
> realise it had a name "front-to-back" ;-)

Front-to-back rendering is _not_ an optimization. It's actually quite a bit slower because for each pixel, you need to check to make sure there's nothing in the frame buffer with a higher priority. It's almost like Z-buffering except that you don't necessarily need a separate occlusion buffer (though I think Stef mentioned Gens uses one.)

Genecyst encoded priority information into the upper 2 bits of each pixel and it used that to determine whether something was already there (because obviously the color data alone tells you nothing.)
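
As an illustration of the kind of encoding described above (not Genecyst's actual code, which the thread does not show), a pixel byte could keep a 6-bit colour in the low bits and a small priority tag in the top 2 bits, with 0 meaning "nothing drawn yet". The priority levels and function names below are assumptions.

```c
#include <stdint.h>

#define COLOUR_MASK 0x3fu  /* 2 palette bits + 4 colour bits */
#define PRIO_SHIFT  6      /* priority tag lives in the top 2 bits */

/* Write a pixel only if nothing of equal or higher priority is already there.
   prio is 1..3 (hypothetical levels); 0 is reserved for "empty". */
static void put_pixel_prio(uint8_t *dst, uint8_t colour6, unsigned prio)
{
    if ((colour6 & 0x0fu) == 0) return;                   /* colour 0 is transparent */
    if ((unsigned)(*dst >> PRIO_SHIFT) >= prio) return;   /* already covered */
    *dst = (uint8_t)((prio << PRIO_SHIFT) | (colour6 & COLOUR_MASK));
}

/* Strip the tag again when converting the buffer to real colours. */
static uint8_t pixel_colour(uint8_t px)
{
    return (uint8_t)(px & COLOUR_MASK);
}
```

Because all 8 bits are used up, there is no room left for a shadow/highlight flag, which fits the limitation mentioned earlier in the thread.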

Front-to-back rendering is required for complete accuracy, though. If you're comfortable with sacrificing that, I'd stick to a back-to-front method because even though there's going to be overdraw, you don't have to worry about reading the frame buffer contents at each pixel.


----
Bart


Subject: Re: Tile Renderers
Posted by: finaldave
Posted on: 07/29/04 07:43 AM



> > Yes i think that was the main thing i was thinking as well, I just didn't
> > realise it had a name "front-to-back" ;-)
>
> Front-to-back rendering is _not_ an optimization. It's actually quite a bit
> slower because for each pixel, you need to check to make sure there's nothing in
> the frame buffer with a higher priority. It's almost like Z-buffering except
> that you don't necessarily need a separate occlusion buffer (though I think Stef
> mentioned Gens uses one.)
>
> Genecyst encoded priority information into the upper 2 bits of each pixel and it
> used that to determine whether something was already there (because obviously
> the color data alone tells you nothing.)
>
> Front-to-back rendering is required for complete accuracy, though.


How come - do you mean like the intro to Sonic 2 with the low priority sprite masking?


You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: Bart T.
Posted on: 07/29/04 10:39 AM



> How come - do you mean like the intro to Sonic 2 with the low priority sprite
> masking?

I don't know about Sonic 2 but IIRC there is a situation that arises with sprite limiting. If the sprites are drawn in the backwards order, the wrong sprites will be cut off once you hit the draw limit. Front to back is the right rendering order on the actual system so it works.

Also, there's a more tricky situation. I hope my memory serves and I can explain it correctly:

Sprites have either high/low priority and this indicates how they are to be drawn with regards to the Scroll planes. Low priority sprites appear above Scroll B but under Scroll A (I think) and high priority sprites appear over Scroll A. But, these priorities have nothing to do with how sprites overlap themselves. The first sprite is supposed to be drawn on top of all the other ones.

This leads to an interesting situation. Assume the higher numbered sprite appears in the list before the other one:

Scroll B -> Low pri sprite #0 -> Scroll A -> Hi pri sprite #1

A back-to-front Genesis renderer would draw Scroll B, then sprite #0 (because it's low priority), then Scroll A, then sprite #1. But sprite #0 appears in the sprite list before #1. I think what happens is, wherever sprite #1 and #0 overlap, #0 actually shows through on top of sprite #1.

I might have gotten this wrong. It's highly dependent on how the Genesis actually draws the sprites, but I clearly remember Charles describing a situation like this being used by some games. You may want to investigate Sonic 1's Marble Zone stage. When you step on the platforms in the lava, the fireballs should appear on top, I believe (compare it with Gens and/or Genecyst to be sure.)

In a nutshell, the problem is that the sprite priority flag describes how sprites are layered with respect to scroll planes but with regards to each other, it's the order they appear in the list that matters. Hope that made some sense.
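
One way to model that behaviour, as a sketch: first resolve sprites against each other purely by list order into a sprite-only line buffer, remembering the priority flag of whichever sprite won each pixel, and only then layer that pixel against the scroll planes. The structure and function names are hypothetical. The mixing rule (a sprite pixel loses only when it is low priority and the plane pixel is high priority) is an assumption about the hardware, not something established in this thread.

```c
#include <stdint.h>
#include <string.h>

#define LINE_W 320

/* Hypothetical description of one sprite's pixels on the current line. */
typedef struct {
    int            x;      /* left edge on this line */
    int            width;  /* number of pixels */
    uint8_t        high;   /* priority flag versus the scroll planes */
    const uint8_t *pix;    /* colour per pixel, 0 = transparent */
} SpriteSpan;

/* Sprite-versus-sprite: earlier list entries win, whatever their flag. */
static void build_sprite_line(uint8_t *colour, uint8_t *high,
                              const SpriteSpan *list, int count)
{
    int i, x;
    memset(colour, 0, LINE_W);
    memset(high, 0, LINE_W);
    for (i = 0; i < count; i++) {
        for (x = 0; x < list[i].width; x++) {
            int sx = list[i].x + x;
            if (sx < 0 || sx >= LINE_W) continue;
            if (list[i].pix[x] == 0 || colour[sx] != 0) continue;
            colour[sx] = list[i].pix[x];
            high[sx]   = list[i].high;   /* flag of the winning sprite */
        }
    }
}

/* Sprite-versus-plane: uses the winning sprite's own priority flag. */
static uint8_t mix_pixel(uint8_t plane_px, uint8_t plane_high,
                         uint8_t spr_px, uint8_t spr_high)
{
    if (spr_px == 0) return plane_px;
    if (plane_px == 0) return spr_px;
    return (plane_high && !spr_high) ? plane_px : spr_px;
}
```

With this split, a low priority sprite early in the list can claim pixels from a later high priority sprite and then itself be hidden behind a high priority plane, which leaves a hole in the later sprite.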


----
Bart


Subject: Re: Tile Renderers
Posted by: Jan_Klaassen
Posted on: 07/29/04 11:32 AM



> In a nutshell, the problem is that the sprite priority flag describes how
> sprites are layered with respect to scroll planes but with regards to each
> other, it's the order they appear in the list that matters. Hope that made some
> sense.

Something similar happens in a lot of other hardware as well, and you can solve it by using a z-buffer for just the sprites (optimise it so it only kicks in when needed and it will usually not impact performance much at all).





Subject: Re: Tile Renderers
Posted by: Bart T.
Posted on: 07/29/04 02:54 PM



> Something similar happens in a lot other hardware as well, and you can solve it
> by using a z-buffer for just the sprites (optimise it so it only kicks in when
> needed and it will usually not impact performance much at all).

But it's not just a case of sprites obstructing other sprites or causing them not to get drawn. They actually appear over sprites that have a higher scroll-plane priority. It seems like z-buffering would need to take into account the plane tiles as well.

I really should investigate the Genesis again to see what actually happens. I think Charles has some documentation on his site.



Subject: Re: Tile Renderers
Posted by: Jan_Klaassen
Posted on: 07/29/04 06:11 PM



> > Something similar happens in a lot other hardware as well, and you can solve it
> > by using a z-buffer for just the sprites (optimise it so it only kicks in when
> > needed and it will usually not impact performance much at all).
>
> But it's not just a case of sprites obstructing other sprites or causing them
> not to get drawn. They actually appear over sprites that have a higher
> scroll-plane priority. It seems like z-buffering would need to take into account
> the plane tiles as well.
>
> I really should investigate the Genesis again to see what actually happens. I
> think Charles has some documentation on his site.

That would be a bit trickier, yes, but you could still handle that with back-to-front drawing. If rendering speed is the goal and this trick isn't used often, back-to-front rendering will still be faster.




Subject: Re: Tile Renderers
Posted by: finaldave
Posted on: 07/29/04 06:46 PM



> > Something similar happens in a lot other hardware as well, and you can solve it
> > by using a z-buffer for just the sprites (optimise it so it only kicks in when
> > needed and it will usually not impact performance much at all).
>
> But it's not just a case of sprites obstructing other sprites or causing them
> not to get drawn. They actually appear over sprites that have a higher
> scroll-plane priority.

I think you are referring to the same effect.

I think it happens on both Genesis (in Sonic 2's intro, like I said) and on CPS2 (e.g. in the SFA3 and Capcom Sports Club intros, and X-Men COTA in a special invisibility move).


What happens is that if a low priority sprite obscures another high priority sprite, only the uncovered pixels appear high.
This makes it appear as if the sprite's silhouette has cut a hole in the high sprite.

From what I recall about Sonic 2, though it was a while ago that I looked at it, there is a low priority sprite in the shape of the arc of the SONIC banner. You don't see it because it's behind the background; its sole purpose is to cut a hole in Sonic and Tails, so they appear to be behind the logo.


Am I right?

I think MameDev call this sprite/scroll priority orthogonality... or some combination of those words anyway ;-)

> It seems like z-buffering would need to take into account
> the plane tiles as well.
>
> I really should investigate the Genesis again to see what actually happens. I
> think Charles has some documentation on his site.
>


You learn something old everyday...



Subject: Re: Tile Renderers
Posted by: Stainless
Posted on: 01/12/06 11:58 AM



One other trick I have used in the past is to take a snapshot of video memory after a render, then on the next pass only render a tile if its memory has changed.

Works very well on a Speccy as it doesn't have any hardware sprites.
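
A minimal sketch of that snapshot trick, with made-up sizes (16 tiles of 32 bytes each) and a hypothetical `draw` callback standing in for the real tile plotter:

```c
#include <stdint.h>
#include <string.h>

#define TILE_BYTES 32   /* hypothetical bytes of video memory per tile */
#define NUM_TILES  16   /* hypothetical number of tiles tracked */

/* Copy of video memory as it was when each tile was last drawn. */
static uint8_t snapshot[NUM_TILES * TILE_BYTES];

/* Redraw only the tiles whose memory differs from the snapshot.
   Returns the number of tiles redrawn; draw may be NULL for a dry run. */
static int redraw_changed(const uint8_t *vmem, void (*draw)(int tile))
{
    int t, n = 0;
    for (t = 0; t < NUM_TILES; t++) {
        const uint8_t *cur = vmem + t * TILE_BYTES;
        uint8_t *old = snapshot + t * TILE_BYTES;
        if (memcmp(cur, old, TILE_BYTES) != 0) {
            if (draw) draw(t);
            memcpy(old, cur, TILE_BYTES);   /* remember what was drawn */
            n++;
        }
    }
    return n;
}
```

This only pays off when most of the screen is static between frames, so a scrolling Megadrive game would probably benefit much less than a typical Spectrum screen.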



