Smooth Scrolling on the ZX Spectrum (Part 1)
The Spectrum's screen handling is not the most straightforward: with no hardware sprites, no hardware scrolling, and an awkward memory layout, it can be intimidating at first. This post details how I managed to scroll a 24×24 character block of tiles on the Spectrum within one VBLANK interval.
The scroll routine achieves this with a number of tricks:
- Using the stack pointer (SP) to fetch tile data and push it to the screen.
- Using self-modifying code to write out the data for each tile row.
- Limiting the number of unique tiles per row to 6.
- Optimising all data (map and tiles).
This post takes excerpts from a scrolling game I’m in the process of writing, and a compilable demo can be found on the demos page on this site. As ever, you are currently free to use ideas or code from my site for non-commercial purposes for nothing more than a thank you and attribution, with a link to my site.
Using the stack pointer to read and write buffers
First we need to discuss why the stack pointer (SP) is a quick alternative for reading and writing data.
The single-byte stack instructions (POP or PUSH with AF, BC, DE or HL) read or write 2 bytes of memory and move the pointer on to the next location in 10 T-states for a POP and 11 T-states for a PUSH. The extended instructions (POP or PUSH with IX or IY) are two-byte instructions and take 14 and 15 T-states respectively.
The alternative is to use two LDI or LDD instructions (32 T-states), or discrete code (26 T-states, for example LD (HL), E / INC HL / LD (HL), D / INC HL).
The speed advantages are clear. However, as the stack pointer is used to push the return address after a CALL instruction, or any other data the programmer decides to store on the stack, care must be taken to make sure that this is done with either interrupts disabled, or within an interrupt routine, and to save and restore the stack pointer around any code that uses it.
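As a minimal sketch of the technique (not taken from the game; the labels Copy_Two_Words, Copy_SP, Source and Dest are hypothetical), the following copies two words with POP and PUSH, disabling interrupts and saving and restoring the stack pointer around the copy. It assumes interrupts were enabled on entry.

```asm
; Copy 4 bytes from Source to Dest via the stack pointer
; 2 x POP (10 T-states each) + 2 x PUSH (11 T-states each) = 42 T-states for the copy itself,
; against 64 T-states for four LDI instructions
;
Copy_Two_Words:         DI                              ; An interrupt now would push its return address into our buffers
                        LD (Copy_SP + 1), SP            ; Save the stack pointer into the LD SP, nn below
                        LD SP, Source                   ; Point the stack at the source data
                        POP BC                          ; BC = first word, SP moves up by 2
                        POP DE                          ; DE = second word
                        LD SP, Dest + 4                 ; PUSH writes downwards, so start just past the end
                        PUSH DE                         ; Second word goes to Dest + 2
                        PUSH BC                         ; First word goes to Dest + 0
Copy_SP:                LD SP, 0                        ; Restore the stack pointer (operand patched above)
                        EI                              ; Assumes interrupts were enabled on entry
                        RET

Source:                 DW 0x1234, 0x5678               ; Example data
Dest:                   DEFS 4, 0                       ; Destination buffer
```

The set-up instructions (saving, loading and restoring SP) add overhead of their own, so the gain only shows over longer runs of data.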
Writing tile data to the screen
This routine is responsible for writing a single row of tiles to the screen:
; Write out a single row of tiles
; HL - Address of the tileset for this row of tiles
; DE - Screen Address
; IX - Address of the routine to push out a pixel row of the tiles
;
Scroll_Tile_Full: LD B, 16 ; Set the tile height in B
Scroll_Tile_Part: LD (Scroll_03 + 1), SP ; Save the stack pointer
LD (Scroll_02 + 1), IX ; Save the draw line routine
Scroll_01: LD SP, HL ; Point the stack at the tileset
LD HL, 12 ; Go to the next line of the tileset
ADD HL, SP
EXX ; Switch to alternate registers
POP BC,DE,HL,AF,IX,IY ; Pop the tileset row into BC', DE', HL', AF, IX and IY
EXX ; Switch back to normal registers
EX DE, HL ; Swap the screen address (in DE) into HL
LD SP, HL ; And load into the stack pointer
EXX ; Switch back to the alternate registers
Scroll_02: JP 0 ; Write out a row of pixels
Scroll_Ret: EXX ; Switch back to the normal registers
EX DE, HL ; Swap DE and HL back again
INC D ; Drop down to the next pixel line of the screen
LD A, D
AND 0x07
JR NZ, Scroll_04 ; Still inside the same character row, so we're done
LD A, E ; Otherwise drop down one character row in screen memory
ADD A, 32
LD E, A
JR C, Scroll_04 ; Carry means we've crossed into the next screen third; D is already correct
LD A, D ; Otherwise step D back by 8 to stay in the same third
SUB 8
LD D,A
Scroll_04: DJNZ Scroll_01 ; Loop
Scroll_03: LD SP, 0 ; Restore the stack pointer
RET
The routine assumes that there can only be 6 unique tiles per row, and the data for a single pixel row of each tile is pulled into the alternate registers by pointing the stack at the tileset and performing a series of POP instructions:
Scroll_01: LD SP, HL ; Point the stack at the tileset
LD HL, 12 ; Go to the next line of the tileset
ADD HL, SP
EXX ; Switch to alternate registers
POP BC,DE,HL,AF,IX,IY ; Pop the tileset row into BC', DE', HL', AF, IX and IY
EXX ; Switch back to normal registers
The tilesets for each row are arranged so that their pixel rows are adjacent in memory, so each row of data is 6 words wide, one word per tile. The DG instruction in SJASMPLUS allows data to be entered in binary within the assembler; in the listing below, - represents a 0 bit and 1 a set bit, and the spaces separate the words.
; POP BC DE HL AF IX IY
;
DG ---------------- ---------------- ---------------- 11-11111-1111111 ----1111111111-1 111-1----11-----
DG ---------------- ---------------- ---------------- -1111111111-111- ----111111111111 -1111-----------
DG -----------1---- ---------------- -----1------1--- -1111-111-1111-- --------11111111 1-11111---------
DG --1------------- --1----1-------1 1--------------- -1-1111-111-1--- ----------11111- 11111111--------
DG ---------------- ---------------1 11-------------- 111111111111--1- --------111111-- 111-11-11-------
DG ------1--------- -----------1--11 111---1----1---- -1-11-11-111-1-- ---1---1111-1--- 111---1111------
DG ---------------- --------------11 111------111---- 11111111111----- ------111111---- 1111---1111-----
DG ------1--------- ------111-----11 1111---1111----- 11-111-111-----1 ------11-11----- -1-1------11----
DG -------------1-- ---1--11111----1 1111--1111------ 1111111111--1--- --1--11111------ -111-------1----
DG --1------------- -------11111---1 1111-1111------- 1111-1111-1---1- -----11--------- -111------------
DG -------1-------- --------11111--1 1111-1111---1--- -1111111-------- ----1----------- --11------------
DG ---------------- ---------1111111 111111-1-------- -1-1111----1---1 ---------------- ---1------------
DG ---------------- --1---1---111111 11111111-------- -11111--1----1-- ---------------- ----------------
DG -----------1---- ------------1111 111111111111---- -1111-----1----- ----1--------1-- ---------1------
DG -----1---------- ------------1111 1111-111111111-- 111--111-----111 ---------1------ ----------------
DG ---------------- ------11111-1111 1-111111-11111-- -11111-1111111-- ---------------- ----------------
If you look at the full routine, you will see that there is no code for writing the data back out using PUSH. That is done via some self-modifying code. The first part is here:
Scroll_Tile_Part: LD (Scroll_03 + 1), SP ; Save the stack pointer
LD (Scroll_02 + 1), IX ; Save the draw line routine
As we are modifying SP, we first store it directly into the operand of the LD SP, nn instruction at the label Scroll_03, so that it is restored at the end of the routine. The routine is also passed the address (in IX) of the routine that is going to write the row data out, and that address is stored directly into the operand of the JP instruction at the label Scroll_02:
Scroll_02: JP 0 ; Write out a row of pixels
The routine to write the data out is itself a piece of self-modifying code based upon the map data that will be covered in the next section. The rest of the code in the routine deals with moving a pixel row down that awkward Spectrum screen memory map, and looping around to draw a 24 x 2 character block of tiles.
Pre-processing the scroll data
For the above routine to work, it needs 13 chunks of code to be written on the fly, one for each visible row on the screen, taking into account the top and bottom rows may not be full tiles, with 11 full tiles in-between. For illustrative purposes, I’ll detail how one row is created.
First, we need to allocate a space in memory for the code to be written into:
Scroll_Write_Row_00: DEFS 27, 0
This is 27 bytes long: up to 24 bytes for the 12 PUSH instructions (one per tile column, bearing in mind that PUSH IX and PUSH IY take up 2 bytes each), and a further 3 bytes for a JP instruction at the end to jump back to the label Scroll_Ret in the scroll routine discussed in the previous section.
The map data is pre-shifted so that the tile number we are interested in is in the top nibble of the byte, so tile 0 is 0x00, tile 1 is 0x10, and so on. There is a macro MAP_ROW that enables me to efficiently store each map row as 12 tile numbers, preceded by the address of the tileset graphics for that row. The macro shifts the tile numbers for me, and stores the columns in reverse order, because PUSH fills the screen row from right to left.
MAP_ROW: MACRO tileset,C01,C02,C03,C04,C05,C06,C07,C08,C09,C10,C11,C12
DW tileset, 0x0000
DB C12<<4,C11<<4,C10<<4,C09<<4,C08<<4,C07<<4
DB C06<<4,C05<<4,C04<<4,C03<<4,C02<<4,C01<<4
ENDM
Map_Data: MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
...
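To illustrate what the macro produces (a hand expansion, so treat it as a sketch rather than assembler output), the first MAP_ROW line above should be equivalent to the following 16 bytes:

```asm
; MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3
;
                        DW Tileset_11, 0x0000                   ; Tileset address, then two spare bytes
                        DB 0x30, 0x10, 0x10, 0x10, 0x00, 0x00   ; Columns 12 down to 7, each tile number << 4
                        DB 0x20, 0x00, 0x00, 0x10, 0x10, 0x10   ; Columns 6 down to 1
```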
As you can see, the map data has already been processed to ensure that the tiles are indexes into the tileset. The following routine is then responsible for reading in the map data and writing out the self-modifying code into the 27-byte buffer we’ve reserved previously:
; Read a map row in and write out the self-modding code
; Parameters:
; DE - Map
; HL - Buffer
; BC - Address to store the tileset address in
;
Initialise_Scroll_0: LD A, (DE) ; Get the tileset low byte
LD (BC), A ; Store in self mod code
INC DE
INC BC
LD A, (DE)
LD (BC), A
INC DE
INC DE ; Skip the spare two bytes
INC DE
LD B, 12 ; Number of tile columns to write out
LD C, 0x40 ; Use this to check for IX and IY
Initialise_Scroll_1: LD A,(DE) ; Get first map tile
AND 0xF0 ; Only interested in top nibble
CP C ; Are we writing out AF, BC, DE or HL
JR C, Initialise_Scroll_1B ; Yes, so jump to write out a single byte PUSH
JR Z, Initialise_Scroll_IX ; If exactly 0x40 then jump to write out PUSH IX
; Write out IY (FD E5) ; So we must be writing out IY at this point
LD (HL), 0xFD
INC HL
LD (HL), 0xE5
JR Initialise_Scroll_Ret
; Write out IX (DD E5)
;
Initialise_Scroll_IX: LD (HL), 0xDD
INC HL
LD (HL), 0xE5
JR Initialise_Scroll_Ret
;
; Write out BC, DE, HL, AF (C5, D5, E5 and F5)
;
Initialise_Scroll_1B: ADD A, 0xC5 ; Quickly add 0xC5 to the tile # to get the PUSH opcode
LD (HL), A ; Write out the PUSH instruction here!
;
; Write out final JP instruction (C3 LL HH)
;
Initialise_Scroll_Ret: INC HL ; Loop to next byte of memory to write out
INC DE ; And the next tile address
DJNZ Initialise_Scroll_1 ; Jump to next tile column
LD (HL), 0xC3 ; Here we're writing out a JP instruction
INC HL
LD (HL),low Scroll_Ret ; Low byte of the return address (Scroll_Ret)
INC HL
LD (HL),high Scroll_Ret ; High byte of the return address
RET
The first few instructions, up to LD B, 12, read the header data for the row: the two bytes of the tileset graphic address, plus a couple of spare bytes I’ve reserved for game data. Incidentally, you may have noticed that each map row is 16 bytes wide, a very convenient number for calculating the index into the map data from a scroll position.
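As a sketch of that index calculation (the routine name Get_Map_Row and the choice of passing the row number in A are assumptions made for illustration, not code from the game), multiplying by 16 is just four doublings:

```asm
; Get the address of a map row
; A  - Map row number (0 to 255)
; Returns:
; HL - Address of the 16-byte row within Map_Data
; Corrupts DE
;
Get_Map_Row:            LD L, A
                        LD H, 0                         ; HL = row number
                        ADD HL, HL                      ; x 2
                        ADD HL, HL                      ; x 4
                        ADD HL, HL                      ; x 8
                        ADD HL, HL                      ; x 16 bytes per row
                        LD DE, Map_Data
                        ADD HL, DE                      ; Add the base address of the map
                        RET
```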
The code within the loop then reads in the map data, and converts that into a series of PUSH instructions.
LD C, 0x40 ; Use this to check for IX and IY
Initialise_Scroll_1: LD A,(DE) ; Get first map tile
AND 0xF0 ; Only interested in top nibble
CP C ; Are we writing out AF, BC, DE or HL
JR C, Initialise_Scroll_1B ; Yes, so jump to write out a single byte PUSH
JR Z, Initialise_Scroll_IX ; If exactly 0x40 then jump to write out PUSH IX
; Write out IY (FD E5) ; So we must be writing out IY at this point
LD (HL), 0xFD
INC HL
LD (HL), 0xE5
JR Initialise_Scroll_Ret
; Write out IX (DD E5)
;
Initialise_Scroll_IX: LD (HL), 0xDD
INC HL
LD (HL), 0xE5
JR Initialise_Scroll_Ret
;
; Write out BC, DE, HL, AF (C5, D5, E5 and F5)
;
Initialise_Scroll_1B: ADD A, 0xC5 ; Quickly add 0xC5 to the tile # to get the PUSH opcode
LD (HL), A ; Write out the PUSH instruction here!
The first instructions of the loop, from Initialise_Scroll_1, mask out the top nibble and compare it with 0x40 (stored in register C for speed). If the value read from the map is less than that, we are writing out a single-byte instruction (PUSH BC, DE, HL or AF), so we jump down to Initialise_Scroll_1B, add 0xC5 to the tile value, and write the result to the buffer. This takes advantage of the fact that 0xC5, 0xD5, 0xE5 and 0xF5 are the byte codes for PUSH BC, PUSH DE, PUSH HL and PUSH AF respectively.
If the byte is exactly 0x40, we jump to Initialise_Scroll_IX and write out PUSH IX, a two-byte instruction with byte codes 0xDD and 0xE5. Failing that, we fall through and write out PUSH IY, with byte codes 0xFD and 0xE5.
Finally, once all 12 PUSH instructions have been written, a JP instruction is written out:
;
; Write out final JP instruction (C3 LL HH)
;
Initialise_Scroll_Ret: INC HL ; Loop to next byte of memory to write out
INC DE ; And the next tile address
DJNZ Initialise_Scroll_1 ; Jump to next tile column
LD (HL), 0xC3 ; Here we're writing out a JP instruction
INC HL
LD (HL),low Scroll_Ret ; Low byte of the return address (Scroll_Ret)
INC HL
LD (HL),high Scroll_Ret ; High byte of the return address
RET
SJASMPLUS has two special operators, low and high. These take a word value and return the low or high byte. Remember the Z80 is little-endian, so L before H.
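As a worked example (derived by hand from the map row shown earlier, so a sketch rather than a dump from the running game), the row MAP_ROW Tileset_11, 1,1,1,0,0,2,0,0,1,1,1,3 should leave the following code in the buffer. Only 15 of the 27 bytes are used here, because none of the tiles in this row map to IX or IY.

```asm
; Contents of Scroll_Write_Row_00 after Initialise_Scroll_0 has run
; Executed with the alternate register set switched in, and SP just past the
; right-hand end of the pixel row, so the first PUSH fills the rightmost column
;
                        PUSH AF                         ; F5 - column 12, tile 3
                        PUSH DE                         ; D5 - column 11, tile 1
                        PUSH DE                         ; D5 - column 10, tile 1
                        PUSH DE                         ; D5 - column 9, tile 1
                        PUSH BC                         ; C5 - column 8, tile 0
                        PUSH BC                         ; C5 - column 7, tile 0
                        PUSH HL                         ; E5 - column 6, tile 2
                        PUSH BC                         ; C5 - column 5, tile 0
                        PUSH BC                         ; C5 - column 4, tile 0
                        PUSH DE                         ; D5 - column 3, tile 1
                        PUSH DE                         ; D5 - column 2, tile 1
                        PUSH DE                         ; D5 - column 1, tile 1
                        JP Scroll_Ret                   ; C3 LL HH - back into the scroll routine
```

With 12 single-byte PUSH instructions at 11 T-states each and a 10 T-state JP, this sequence comes to 142 T-states per pixel row.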
Wrapping it up
There’s quite a lot to take in there, so to quickly summarise, the routine does the following:
- Read in the map data, and from that, dynamically write out some Z80 code, with a register pair (AF, BC, DE, HL, IX or IY) assigned to one of the 6 tiles allowed per row.
- Write out the map data by POPing each tileset’s pixel row into those registers, and then PUSHing the data onto the screen.
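The following is a guess at how a caller could tie the two routines together. The labels Build_Row_00 and Draw_Row_00, the screen address, and the use of DI and EI around the draw are all assumptions for illustration; the game's actual calling code may differ.

```asm
; Build the code for one map row (hypothetical caller)
;
Build_Row_00:           LD DE, Map_Data                 ; First row of the map
                        LD HL, Scroll_Write_Row_00      ; The 27-byte code buffer
                        LD BC, Draw_Row_00 + 1          ; Operand of the LD HL, nn below
                        CALL Initialise_Scroll_0
                        RET

; Draw that row of tiles (hypothetical caller)
;
Draw_Row_00:            LD HL, 0                        ; Patched with the tileset address by Initialise_Scroll_0
                        LD DE, 0x4000 + 24              ; Assumed: the row starts at the top left of the screen
                                                        ; PUSH writes downwards, so pass the address just past its right-hand end
                        LD IX, Scroll_Write_Row_00      ; The generated PUSH code
                        DI                              ; Keep interrupts out while SP points at data
                        CALL Scroll_Tile_Full
                        EI                              ; Assumes interrupts were enabled on entry
                        RET
```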
Subsequent chapters will detail how the map data is created, how sprites are written to the screen, and the timing involved to avoid flicker (racing the beam).