chr sheet of most used tiles

This is an archive of a topic from NESdev BBS, taken in mid-October 2019 before a server upgrade.
chr sheet of most used tiles
by on (#204383)
So for no particular reason I thought it would be interesting to see what the most common tiles are across all* games with CHR-ROM.

*: From a particular 2011 "nointro" ROM collection that was able to be parsed by "ines.py"

After some basic de-duplication of full sheets, I ran them through a script to count the frequency of each tile. The attached image shows 512 tiles, starting from the most frequently occurring.

Attachment:
2017-09-10_most-used-tiles.chr.png
2017-09-10_most-used-tiles.chr.png [ 2.71 KiB | Viewed 5543 times ]


Most of the tiles are basic rectangles, shapes, letters from a common NES font, and X marks.

4 tiles popped up that, it seems to me, should be unique, but appeared at least 300 times across my data set.

Attachment:
2017-09-10_intresting.png
2017-09-10_intresting.png [ 250 Bytes | Viewed 5543 times ]
Re: chr sheet of most used tiles
by on (#204385)
Interesting research! But I think it should be once per game. Some games just repeat tiles in unused spaces, so I think it would be more interesting to see which tiles appear in most games, not counting how many times they appear in any particular game.
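
The difference between the two counting schemes can be sketched as follows. This is only an illustration, not the script used in the thread: `split_tiles`, `count_occurrences`, `count_games` and the toy data are hypothetical names and values, and each "game" is assumed to be one blob of raw CHR data.

```python
import collections

TILE_SIZE = 16  # bytes per 8x8 tile in the NES 2bpp format


def split_tiles(chr_data):
    """Split raw CHR data into 16-byte tiles, ignoring a trailing partial tile."""
    return [chr_data[i:i + TILE_SIZE]
            for i in range(0, len(chr_data) - TILE_SIZE + 1, TILE_SIZE)]


def count_occurrences(games):
    """Count every appearance of a tile, including repeats inside one game."""
    counts = collections.Counter()
    for chr_data in games:
        counts.update(split_tiles(chr_data))
    return counts


def count_games(games):
    """Count each tile at most once per game."""
    counts = collections.Counter()
    for chr_data in games:
        counts.update(set(split_tiles(chr_data)))
    return counts


# Hypothetical toy data: game 1 pads unused space with a filler tile.
filler = bytes([0xAA] * 16)
blank = bytes(16)
letter = bytes(range(16))
games = [filler * 3 + blank, blank + letter]

print(count_occurrences(games).most_common(1)[0][0] == filler)  # True
print(count_games(games).most_common(1)[0][0] == blank)         # True
```

With raw occurrence counts the filler tile wins even though only one game contains it; counting once per game puts the blank tile, which both games share, on top.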
Re: chr sheet of most used tiles
by on (#204390)
Attachment:
2017-09-11_most-shared-tiles.chr.png
2017-09-11_most-shared-tiles.chr.png [ 2.1 KiB | Viewed 5523 times ]

OK, that did cut out the appearances of the "unused tile" graphics. Now it's pretty much only basic monochrome shapes.
Each of these tiles appears in at least 5% of the games, with the top four being in about 97% of the games.
Re: chr sheet of most used tiles
by on (#204393)
When I went and made the fingerprint of 128-byte slices in GoodNES, I was surprised at just how many times the slice that was the numbers "01234567" (in the same font) showed up.
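
For reference, 128 bytes is exactly eight 16-byte tiles, which is why a run like "01234567" can fill one slice. A fingerprint count of this kind might look like the sketch below. The post does not say how the slices were taken or hashed, so the 128-byte alignment, the use of SHA-1, and the names `slice_fingerprints` and `count_shared_slices` are all assumptions.

```python
import collections
import hashlib

SLICE_SIZE = 128  # eight 16-byte tiles


def slice_fingerprints(data, size=SLICE_SIZE):
    """Return the set of SHA-1 digests of the aligned, full-size slices of data."""
    return {hashlib.sha1(data[i:i + size]).hexdigest()
            for i in range(0, len(data) - size + 1, size)}


def count_shared_slices(roms):
    """Count in how many ROMs each slice fingerprint appears (once per ROM)."""
    counts = collections.Counter()
    for data in roms:
        counts.update(slice_fingerprints(data))
    return counts


# Hypothetical toy data: two "ROMs" that share one 128-byte slice.
shared = bytes(range(128))
rom_a = shared + bytes(128)
rom_b = bytes([0xFF] * 128) + shared
counts = count_shared_slices([rom_a, rom_b])
print(counts[hashlib.sha1(shared).hexdigest()])  # 2
```

A slice that starts at an offset that is not a multiple of 128 would be missed by this aligned scan.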

In hindsight, I probably shouldn't have been.
Re: chr sheet of most used tiles
by on (#204394)
Funny how many basic shapes can exist. Now someone take this sheet and make the most average NES game ever. :)
Re: chr sheet of most used tiles
by on (#204398)
Analyzing what font characters are present…
Code:
   ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890 (for alignment/comparison)
01 ABCDEF H     NOP   TU        45      c,©,OK
nothing on color 02
03 AB DEF HI    NOP   TU    Z123456789  2B

█ in all four colors wins, natch
▒ in 10, 01, 21, 12, 03, 30, 32; all of which are sensible ones. Odd that 32 isn't present.
We can see that people use colors in 0312 frequency, mostly.
Oddly, that second sheet now admits © in color 3 when it wasn't there before, and about halfway up the ranking.

Interestingly, we don't have all of ETAOINsHrDlU, and I/C aren't in both colors! Part of this is, I'm sure, the various nature of renderings of S. L doesn't make sense, though. Also interesting that no separate 0 rendering came up, though again, various renderings to distinguish from O…and colliding/sharing with O in some cases.

I am amused that 10 of 5-16 of Shared are what almost certainly are sprite-0 triggers.

Could you cull those one-rectangle on other-color shapes? At least, the ones where the rectangle touches three edges. (Noting as numbers these shapes, like "left 7/8ths 1 on 0", might be more readable than in a character sheet.)
Could you make a version where it treats palette swaps as additional instances of the first tile encountered with that pattern? Possibly allowing BG to be swapped, possibly not; for sprites it means something, after all.

Of course, this simply misses all CHR-RAM games.
Re: chr sheet of most used tiles
by on (#204399)
And given unalignment and compression, I don't think there's a good way to reach CHR RAM games.
Re: chr sheet of most used tiles
by on (#204401)
Nothing as fast, surely. Running TAS or other game-completing key recordings through games on a CHR-logging emulator would be a naïve way to reach them, but, of course, is many orders of magnitude slower than what was done here.

If you wanted a less thorough scrape in that manner, perhaps a "wait for 1 minute, autofire A+Start for a minute" input file would probably get you through nigh-all Title, Menu, Pause, Game allocations. [waiting a minute as attract modes will be smörgåsbords!]
Re: chr sheet of most used tiles
by on (#204410)
Rectangle heuristic:
2-color [if one color, well, that's already a rectangle, no need for further testing…]
has at most two different 8-pixel slices, which both fit the pattern and the distribution of which fit one of the nine A*8-0, B*0-8 (0, 1, 3, 7, F, 1F, 3F, 7F, FF; the complementary FE, FC, F8, F0, E0, C0, 80 should be taken care of by the palette-swap logic)
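
One possible reading of this heuristic, as a sketch only: the names `RIGHT_RUNS`, `color_indices`, `row_masks`, `fits_pattern` and `looks_like_rectangle` are made up here, and "distribution" is interpreted as "slice A for some rows, then slice B for the rest". The complementary patterns are handled by testing the mask of each of the two colors instead of a separate palette-swap pass.

```python
# The nine right-aligned runs named in the post.
RIGHT_RUNS = {0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F, 0xFF}


def color_indices(tile):
    """Decode a 16-byte 2bpp tile into 64 color indices (0-3), row by row."""
    pixels = []
    for row in range(8):
        lo, hi = tile[row], tile[row + 8]
        for bit in range(7, -1, -1):
            pixels.append(((lo >> bit) & 1) | (((hi >> bit) & 1) << 1))
    return pixels


def row_masks(pixels, color):
    """Return eight row bytes with a 1 bit wherever the pixel has the given color."""
    rows = []
    for r in range(8):
        mask = 0
        for p in pixels[r * 8:r * 8 + 8]:
            mask = (mask << 1) | int(p == color)
        rows.append(mask)
    return rows


def fits_pattern(rows):
    """True if every row is a right-aligned run and the rows change at most once."""
    if any(r not in RIGHT_RUNS for r in rows):
        return False
    changes = sum(1 for a, b in zip(rows, rows[1:]) if a != b)
    return changes <= 1


def looks_like_rectangle(tile):
    pixels = color_indices(tile)
    colors = sorted(set(pixels))
    if len(colors) == 1:
        return True   # a solid tile is already a rectangle
    if len(colors) != 2:
        return False  # the heuristic only covers 2-color tiles
    # Trying both colors covers the complementary FE, FC, ... 80 slices.
    return any(fits_pattern(row_masks(pixels, c)) for c in colors)


left_half = bytes([0xF0] * 8) + bytes(8)       # left 4 columns color 1 on color 0
checker = bytes([0xAA, 0x55] * 4) + bytes(8)   # dither pattern
print(looks_like_rectangle(left_half), looks_like_rectangle(checker))  # True False
```

As written in the post, two different non-empty slices (for example 0F followed by 03) also pass, so this accepts some L-shaped tiles as well as true rectangles.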
Re: chr sheet of most used tiles
by on (#204412)
I wonder how practical it would be to search with deduplication of bitplane permutations...?
Re: chr sheet of most used tiles
by on (#204417)
This is really interesting!

Of course, that "X" and other tiles around it are for marking unused/empty tiles. The kanji there literally means "empty".
Re: chr sheet of most used tiles
by on (#204419)
I tried out normalizing bitplane permutations. Rectangles combined as a side effect. Attached is the count script I used for this.
Attachment:
2017-09-11_bitplanes-normalized.png
2017-09-11_bitplanes-normalized.png [ 3.79 KiB | Viewed 5427 times ]

If some of the letters look odd with a gray background, it's because the normalization sets the top left pixel to be always color 0.
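
The effect can be checked with a compact restatement of the normalization from the script inlined later in this post. This is a sketch for illustration: `top_left_color` and the sample glyph bytes are hypothetical additions.

```python
def normalize_tile(tile):
    """Map every color permutation of a 16-byte 2bpp tile to one canonical tile."""
    lo, hi = tile[0:8], tile[8:16]
    masks = [
        bytes((0xFF ^ a) & (0xFF ^ b) for a, b in zip(lo, hi)),  # color 0
        bytes(a & (0xFF ^ b) for a, b in zip(lo, hi)),           # color 1
        bytes((0xFF ^ a) & b for a, b in zip(lo, hi)),           # color 2
        bytes(a & b for a, b in zip(lo, hi)),                    # color 3
    ]
    masks.sort(reverse=True)  # the mask holding the top left pixel sorts first
    plane_0 = bytes(a | b for a, b in zip(masks[1], masks[3]))
    plane_1 = bytes(a | b for a, b in zip(masks[2], masks[3]))
    return plane_0 + plane_1


def top_left_color(tile):
    """Color index (0-3) of the top left pixel."""
    return ((tile[0] >> 7) & 1) | (((tile[8] >> 7) & 1) << 1)


# Hypothetical letter-like glyph whose top left pixel is background.
glyph = bytes([0x3C, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00])
color_1_on_0 = glyph + bytes(8)
color_3_on_2 = glyph + bytes([0xFF] * 8)
print(normalize_tile(color_1_on_0) == normalize_tile(color_3_on_2))  # True

# A shape that covers the top left pixel comes out inverted.
dot = bytes([0x80] + [0] * 7) + bytes(8)
print(top_left_color(dot), top_left_color(normalize_tile(dot)))  # 1 0
```

In the second case the single set pixel becomes color 0 and the other 63 pixels become color 1, which is the kind of inversion that gives some letters a filled background.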

I might manually search beyond 512 tiles for the whole alphabet, but that'll require using my eyes and may take a bit of time.

In the meanwhile here's a slightly more readable version of above.
Attachment:
2017-09-11_readable.png
2017-09-11_readable.png [ 4.4 KiB | Viewed 5421 times ]


ccovell wrote:
The kanji there literally means "empty".

Ah, that makes sense. Thanks for the tidbit.

Edit: Former attachment is now inlined here:
Code:
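#!/usr/bin/env python3
# Count how many input CHR files contain each tile (after color normalization)
# and write the 512 most widely shared tiles to result.chr.

import sys
import collections
import functools

all_tiles = collections.Counter()

@functools.lru_cache(maxsize=None, typed=False)
def normalize_tile(input_tile):
    # not the most efficient way, but it works.
    # One 8-byte mask per color; a set bit marks a pixel of that color.
    color_0 = bytes(((0xff^a) & (0xff^b)) for a, b in zip(input_tile[0:8], input_tile[8:16]))
    color_1 = bytes((a & (0xff^b)) for a, b in zip(input_tile[0:8], input_tile[8:16]))
    color_2 = bytes(((0xff^a) & b) for a, b in zip(input_tile[0:8], input_tile[8:16]))
    color_3 = bytes((a & b) for a, b in zip(input_tile[0:8], input_tile[8:16]))
    # Sort the masks in descending order so that every color permutation of
    # a tile gives the same result. The mask containing the top left pixel
    # sorts first and becomes color 0.
    norm = sorted([color_0, color_1, color_2, color_3], reverse=True)
    # Rebuild the bitplanes: plane 0 holds colors 1 and 3, plane 1 holds colors 2 and 3.
    plane_0 = bytes((a | b) for a, b in zip(norm[1], norm[3]))
    plane_1 = bytes((a | b) for a, b in zip(norm[2], norm[3]))
    return plane_0 + plane_1

for input_filename in sys.argv[1:]:
    with open(input_filename, 'rb') as input_file:
        input_chr = input_file.read()
    # Alternative (no normalization): count every occurrence instead of once per file.
    #all_tiles.update(input_chr[i*16:i*16+16] for i in range(len(input_chr)//16))
    # A set counts each normalized tile at most once per file.
    tileset = set(normalize_tile(input_chr[i*16:i*16+16]) for i in range(len(input_chr)//16))
    all_tiles.update(tileset)

with open("result.chr",'wb') as output_file:
    for tile, count in all_tiles.most_common(512):
        output_file.write(tile)
        print(count)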
Re: chr sheet of most used tiles
by on (#204424)
JRoatch wrote:
In the meanwhile here's a slightly more readable version of above.
Attachment:
The attachment 2017-09-11_readable.png is no longer available

Oh, now this is interesting!

I can't help but think that if this was organized, it would heavily resemble the Intellivision's built-in GROM.

Attachment:
intvfont.png
intvfont.png [ 2.25 KiB | Viewed 5398 times ]
Re: chr sheet of most used tiles
by on (#204431)
Note that these are the tiles most often present in CHR-ROM, not the most used ones, i.e. you ignore whether a game actually uses these tiles a lot, a little, or not at all during gameplay. This is a major difference.

Quote:
I tried out normalizing bitplane permutations.

Did you only account for bitplane permutations (exchanging colours 1 and 2) or any colour permutation? It seems you did the latter, which makes more sense.
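
To illustrate the distinction: swapping the two bitplanes of a tile only exchanges colours 1 and 2, leaving 0 and 3 in place, so it covers just one of the possible colour permutations. A small sketch, with hypothetical helper names `pixel_colors` and `swap_bitplanes`:

```python
def pixel_colors(tile):
    """Decode a 16-byte 2bpp tile into 64 colour indices (0-3), row by row."""
    colors = []
    for row in range(8):
        lo, hi = tile[row], tile[row + 8]
        for bit in range(7, -1, -1):
            colors.append(((lo >> bit) & 1) | (((hi >> bit) & 1) << 1))
    return colors


def swap_bitplanes(tile):
    """Exchange plane 0 (first 8 bytes) and plane 1 (last 8 bytes)."""
    return tile[8:16] + tile[0:8]


# Test tile: each row contains all four colours in pairs.
tile = bytes([0x0F] * 8) + bytes([0x33] * 8)
print(pixel_colors(tile)[:8])                  # [0, 0, 2, 2, 1, 1, 3, 3]
print(pixel_colors(swap_bitplanes(tile))[:8])  # [0, 0, 1, 1, 2, 2, 3, 3]
```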
Re: chr sheet of most used tiles
by on (#206977)
I extracted all unique 1- or 2-color tiles from 7080 NES games. There were about 800,000 such tiles. I normalized the colors within each tile: the more common color is always black and the less common color is white. (If there were 32 pixels of each, the colors were inverted if the top left pixel was white.)
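
The colour normalization described above might look roughly like this. It is a sketch and not the attached script: `normalize_two_color` is a hypothetical name, and the result is assumed to be an 8-byte mask with one bit per pixel (1 = white).

```python
import collections


def normalize_two_color(tile):
    """Reduce a 1- or 2-color 2bpp tile to 8 mask bytes (1 = white).

    Returns None for tiles with more than two colors.
    """
    pixels = []
    for row in range(8):
        lo, hi = tile[row], tile[row + 8]
        for bit in range(7, -1, -1):
            pixels.append(((lo >> bit) & 1) | (((hi >> bit) & 1) << 1))
    counts = collections.Counter(pixels)
    if len(counts) > 2:
        return None
    if len(counts) == 1:
        return bytes(8)  # a single color is the "more common" one: all black
    (color_a, n_a), (color_b, n_b) = counts.most_common(2)
    if n_a == n_b:
        # Tie (32 pixels each): make the top left pixel black.
        white = color_b if pixels[0] == color_a else color_a
    else:
        white = color_b  # the less common color
    rows = []
    for r in range(8):
        mask = 0
        for p in pixels[r * 8:r * 8 + 8]:
            mask = (mask << 1) | int(p == white)
        rows.append(mask)
    return bytes(rows)


# One color-3 pixel on color 0, and its inverse using colors 0 and 2.
dot = bytes([0x80] + [0] * 7) * 2
inverse = bytes(8) + bytes([0x7F] + [0xFF] * 7)
print(normalize_two_color(dot) == normalize_two_color(inverse))  # True
```

Both sample tiles reduce to the same mask, so they would count as one unique tile.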

I have attached an image of the tiles and the scripts I used. (Run at your own risk.)

Next, maybe try sorting the tiles by number of white pixels...
Re: chr sheet of most used tiles
by on (#207213)
qalle wrote:
Next, maybe try sorting the tiles by number of white pixels...

Not as interesting as I thought, but here you go.
Edit: optimized the image with pngout.