
Oric programs statistics

Posted: Wed May 13, 2015 10:27 pm
by Symoon
Hi there,
I have counted the byte values in 900 TAP files (representing 15 MB of data).

The result is quite interesting.
We are mainly filling our Oric memories with... nothing!
Here are the most frequent values:
[Attachment: Oric_stats.gif]
Yes, taken together, these values make up almost 20% of the Oric programs.

So guys, 20% of our Oric programs could now be compressed with 2 bits instead of 8 ;)

Oh BTW, the least used value?
It's 219 (decimal), used 5292 times.

I should now count the bits and see whether Oric was right to make 1 shorter than 0 in CSAVE! ;)
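For readers who want to reproduce this kind of count, here is a minimal Python sketch. It is not the Excel/TCC tool mentioned in this thread; the function names are hypothetical, and it counts every byte it is given, so TAP headers and sync bytes are included unless you strip them first.

```python
from collections import Counter


def count_bytes(blobs):
    """Return a Counter mapping each byte value (0-255) to its total count.

    `blobs` is any iterable of bytes objects, e.g. the raw contents of TAP files.
    """
    totals = Counter()
    for blob in blobs:
        totals.update(blob)  # iterating over bytes yields ints in 0..255
    return totals


def report(totals, top=4):
    """Return (value, count, percent) tuples for the most frequent byte values."""
    grand_total = sum(totals.values())
    return [(value, count, 100.0 * count / grand_total)
            for value, count in totals.most_common(top)]


def least_used(totals):
    """Return the byte value with the lowest count (values never seen count as 0)."""
    return min(range(256), key=lambda value: totals[value])
```

To run it over a folder of tapes (the folder name is an assumption), something like `count_bytes(p.read_bytes() for p in pathlib.Path("taps").glob("*.tap"))` should work after `import pathlib`.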

Re: Oric programs statistics

Posted: Wed May 13, 2015 10:48 pm
by Chema
He he nice finding Symoon :)

Did you analyze the tap file or the Oric memory once it was loaded? I guess the former, which is surprising...

Or not... I tend to ignore the empty buffers when creating a tap file, as load times are nowadays quite low (either disk, or tap2cd or emulation) :)

Re: Oric programs statistics

Posted: Wed May 13, 2015 11:02 pm
by Symoon
Chema wrote:Did you analyze the tap file or the Oric memory once it was loaded?
It's 100% based on TAP files, almost all being from commercial tapes of the 80's.

I suspect recent programs would be more optimized, especially regarding the "UUUUU" ;)
Still testing my tool, should release it soon (Excel-based again, I re-used TCC code to have it quickly done).

I was expecting #FF to be in the top ten, but it's only in 71st position!

Re: Oric programs statistics

Posted: Thu May 14, 2015 6:54 am
by coco.oric
:lol: ... and what are the software records in the use of SPACE, U, null and @ ?

Re: Oric programs statistics

Posted: Thu May 14, 2015 7:49 am
by Symoon
coco.oric wrote::lol: ... and what are the software records in the use of SPACE, U, null and @ ?
Well, I don't have individual results. The process is quite slow, so I didn't always stay in front of the computer while the figures were increasing, but I saw that Jogger had lots of UUUU!
I just checked: the Jogger TAP file is 31k, of which 18.4k is U... :shock:

Edit: seems I already noticed it while doing the transfer, this is from my transfer notes: "Strange, the program seems to be mostly filled with useless garbage, and could have been much shorter to load." ;)

Re: Oric programs statistics

Posted: Thu May 14, 2015 9:00 am
by Chema
I understand that, if you just save your whole program in one block without taking care to avoid empty tables and unused space, you can end up with something like that.

Buffers initialized to 0 could be common, but also with $40 for empty graphics (with no attributes) and even $20 which are empty spaces in the TEXT screen.

What is stranger is having blocks of $55. I guess that when developing at that time, you wrote your assembly routines and put them in specific parts of the memory. When your program is ready you have many blocks of code here and there. If you don't load them separately you end up saving garbage.

A good example is when you write a small program but redefine the character set. If you save it all as one block you may include a lot of $55 from where your program ends up to the character set data!

Anyway interesting :)

Re: Oric programs statistics

Posted: Thu May 14, 2015 12:20 pm
by Godzil
Chema: I'm even sure that some games/apps may have remnants of old code that was overridden by a newer version of a function.

You see that quite often in ROM binaries, where some parts are unused and still hold values that turn out to be part of a function that has since been changed, so that part is no longer used.

Re: Oric programs statistics

Posted: Thu May 21, 2015 11:32 pm
by Symoon
Ok, I made statistics on 3450 TAP files.
Our Oric programs are globally made of 62% of "0" bits, and 38% of "1" bits.

That means Oric didn't choose the tape encoding well, since "0" takes 33.33% longer to save than "1".
By inverting this, our programs would globally have been 8% shorter to CSAVE/CLOAD.
:D
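The arithmetic can be sketched as follows. This is a rough model, not the actual tape timing: it takes only the two figures quoted above (62% zeroes, "0" costing 4/3 of a "1") and ignores start, parity and stop bits, which would shrink the gain further. The function names are hypothetical.

```python
def bit_fractions(data):
    """Return (fraction of 0 bits, fraction of 1 bits) in a bytes object."""
    ones = sum(bin(byte).count("1") for byte in data)
    total = 8 * len(data)
    return (total - ones) / total, ones / total


def relative_save_time(zero_fraction, zero_cost=4 / 3, one_cost=1.0):
    """Average time per data bit, in units of one "1" bit.

    zero_cost=4/3 encodes the assumption that "0" is 33.33% longer than "1".
    """
    return zero_fraction * zero_cost + (1.0 - zero_fraction) * one_cost


current = relative_save_time(0.62)   # encoding as it is: 62% of bits are slow
swapped = relative_save_time(0.38)   # hypothetical inverted encoding
saving = 1.0 - swapped / current     # fraction of the data-bit time saved

print(f"current={current:.4f} swapped={swapped:.4f} saving={saving:.1%}")
```

Under these assumptions the gap is 0.08 of a "1" bit per data bit, which may be where the 8% figure comes from; measured against the current total it works out to roughly 6.6%.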

Re: Oric programs statistics

Posted: Fri May 22, 2015 7:43 am
by ibisum
Wow, that is fascinating! :)

Re: Oric programs statistics

Posted: Fri May 22, 2015 8:02 am
by iss
Very interesting indeed!
One more possible source of 0s is the padding for page-aligned arrays in machine code programs.

Re: Oric programs statistics

Posted: Fri May 22, 2015 9:46 am
by Godzil
But is it true for every program?

Because it seems that lots of apps store arrays of 0 for unknown reason

Re: Oric programs statistics

Posted: Fri May 22, 2015 3:30 pm
by ibisum
>0 for unknown reason

Initialized RAM default?

Seems like an opportunity missed. Hindsight is 20/20, but geeze ..

Re: Oric programs statistics

Posted: Fri May 22, 2015 6:24 pm
by Symoon
Godzil wrote:But is it true for every program?
Well, it's just an average of course, based on TAP files on my hard drive, so the statistic is what it is ;)
I guess we'd have different profiles for BASIC programs, ASM programs, and data blocks.
Here, I just let the tool run for a few hours and analyse the 426 million bits!

I ran the test on a single file (Zorgons Revenge) and got more or less the same results for the bits (63.5% zeroes). The byte frequency was different, obviously: the most used values were the equivalent of "empty hires screen", LDA and STA ;)
ibisum wrote:>0 for unknown reason
Initialized RAM default?
Default system RAM initialization, if I'm not mistaken, is a 01010101 pattern (which gives the letter U)
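As a quick check of the values discussed in this thread (null, SPACE, @ and U), the small Python sketch below prints each one in hex and binary, with its ASCII character and its number of "1" bits. The helper name `describe` is made up for illustration.

```python
def describe(value):
    """Return (hex, binary, character, number of one-bits) for a byte value."""
    return (f"${value:02X}", f"{value:08b}", chr(value), bin(value).count("1"))


# The four values mentioned earlier in the thread: null, SPACE, @ and U.
for value in (0x00, 0x20, 0x40, 0x55):
    print(describe(value))
```

Note that none of these four values has more than half of its bits set, which fits with zeroes dominating the bit count.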

Re: Oric programs statistics

Posted: Fri May 22, 2015 9:14 pm
by Dbug
My own code has a lot of zeroes in it.

The reason is simple: Page alignment for critical routines and arrays to avoid paying additional access penalty cycles when moving across a page boundary.
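To give an idea of how much padding page alignment can add, here is a minimal Python sketch. It assumes 256-byte pages (as on the 6502) and an assembler that fills the gap with zero bytes; the function name and the example addresses are hypothetical.

```python
PAGE_SIZE = 256  # a 6502 page is 256 bytes


def padding_to_next_page(address):
    """Number of filler bytes needed so that `address` lands on a page boundary."""
    return (-address) % PAGE_SIZE


# Example: code ends at $1234, so the next table is pushed to $1300.
end_of_code = 0x1234
pad = padding_to_next_page(end_of_code)
print(f"{pad} padding bytes, table starts at ${end_of_code + pad:04X}")
```

In the worst case each aligned table or routine costs up to 255 filler bytes, so a handful of them can add a noticeable share of zeroes to a small program.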

Re: Oric programs statistics

Posted: Sat May 23, 2015 5:52 am
by Symoon
Oricium has 70% of zeroes ;) (bits, not bytes)
Also tested individually:
- Morts Subites (text basic adventure game)
- Crypt Show (hires adventure game)
- Defence Force
- 3D Fongus

All are between 60% and 65% of zeroes.