This is the new, fully trained left-well version of BetaTetris. Its average score on 12Hz 350ms is ~1.15mil, with averages of 605k pre-transition and 16k post-post-transition.
It also improved its 29-start 12Hz average score from 44k (the agent featured in the previous 219 lines video) to 48k. It seems to do this by improving its scoring efficiency, including occasionally setting up column 3 tetrises (~2.4% TRT), as seen in this video.
One apparent issue is that it overuses its adjustment ability, making lots of unnecessary adjustments. This is because I didn't add any penalty on adjustments, so the agent just moves a piece to a random spot initially, given that it can adjust it to the desired position later. It is fixable with a little incremental training with a penalty; I just haven't bothered to do that yet, and it is also somewhat amusing to watch it play like that.
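For illustration only, here is a minimal sketch of what a per-adjustment penalty could look like. It is not taken from the BetaTetris source: the function name `shaped_reward`, its parameters, and the default penalty value are all hypothetical, and the actual training reward may be structured quite differently.

```python
def shaped_reward(score_gain: float, num_adjustments: int, penalty: float = 0.01) -> float:
    """Hypothetical shaped reward for one piece placement.

    score_gain: reward earned from the placement itself (assumed already scaled).
    num_adjustments: how many mid-fall adjustments the agent made for this piece.
    penalty: assumed fixed cost per adjustment; 0 reproduces the unpenalized case.
    """
    if num_adjustments < 0:
        raise ValueError("num_adjustments must be non-negative")
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    # With penalty == 0 every adjustment is free, so the agent has no reason
    # to pick a sensible initial placement. A small positive cost breaks that tie.
    return score_gain - penalty * num_adjustments


# Same placement outcome, with and without an adjustment.
direct = shaped_reward(1.0, num_adjustments=0)
adjusted = shaped_reward(1.0, num_adjustments=1)
print(direct > adjusted)
```

With a zero penalty the two calls return the same value, which matches the behaviour described above; any positive penalty makes the direct placement strictly preferable when the outcome is otherwise identical.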
Source code, more statistics & model are available here: github.com/adrien1018/beta-te...
These were the best 2 games of the ~150 it played, so again, the fact that they happened back to back is pure luck. The pre-killscreen scores of these games are only slightly above average.