lhTools - Sharp PC-1500 BASIC compiler/decompiler & LH 5801 assembler/disassembler
© Christophe Gottheimer, 1992-2014
Download C source code (v0.7.0) for Unix/Linux/*BSD/MAC-OSX:
Download Windows32 version (v0.7.0) (not tested yet):
Download full documentation in PDF: lhTools-0.7.0.pdf.
lhTools is a toolbox for disassembling and decompiling the BASIC or machine language programs of the SHARP PC-1500/A and TRS-80 PC-2.
It consists of 2 programs:
- lhasm (formerly lhbin) : To produce a binary image from BASIC, assembler, or hexadecimal dump sources;
- lhdump : To decompile, disassemble and decode the binary images.
Note: This version is still in pre-alpha release. It is not fully mature and bugs may be present.
Disclaimer: You use this code at your own risk. I
am not responsible for any damage or any data lost or corrupted by
using this software or by using the binary images created with this
software while running them on a Sharp PC-1500/A or TRS-80 PC-2. Be sure
to save your important data or programs before loading and running the
binary images. C.G.
Usage: ./lhasm [-h] [-v] [-d] [-e] [-nc] [-ns] [-i] [-1] [-x] [-E] [-W] [-D symbol=value...]
[-T | -L logfile ] [-F fragfile] [-K keywordfile] [-S symfile] [-M macfile]
[-O origin] [fragment] [-N | -no] [-o outfile] srcfile
-h This help
-v Show version and exit
-d Show debug information
-e Enable verbose mode
-nc Disable comment copy into log file (if -T or -L set)
-ns Disable symbols/variables list into log file (if -T or -L set)
-i Immediate one pass only assembler. Read from stdin
-1 Do assembler pass 1 only and stop
-x Output hexadecimal dump instead of binary into outfile (.hex)
-E Warnings treated as errors
-W Errors treated as warnings. Do not stop on errors
-D symbol=value Define the specified symbol to the given value
Several -D may be given if several symbols need to be defined
-T Enable trace mode output to stderr.
This is exclusive with the -L log option
-L logfile Output logs of the assembler processing into logfile (.log)
This option is exclusive with the -T option
-F fragfile Output fragments to fragfile (.frag)
-K keywordfile Output declared BASIC keywords to keywordfile (.keyw)
-S symfile Output declared symbols to symfile (.sym)
-M macfile Output defined macros to macfile (.mac)
-O origin Set origin address to specified value
fragment Set original fragment
-B BASIC fragment; This is the default
-R RESERVE fragment
-X XREG fragment
-V dynamic VARiables fragment
-c CODE fragment
-b BYTE (8-bits) fragment
-w WORD (16-bits) fragment
-l LONG (32-bits) fragment
-t TEXT fragment
-k KEYWORD fragment
-H HOLE fragment
-N|-no Do not output generated binary code
-o outfile Output binary code into outfile (.bin)
If srcfile has a .bas extension, the BASIC mode is assumed.
If srcfile has a .asm extension, the CODE mode is assumed.
If srcfile has a .hex extension, the BYTE mode is assumed.
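The three extension rules above can be pictured as a simple lookup. The following Python sketch is only an illustration, not code from lhasm; the names `EXTENSION_TO_FRAGMENT` and `default_fragment` are hypothetical, and the fallback to BASIC for other extensions is an assumption based on -B being described as the default fragment.

```python
import os

# Mapping taken from the three rules stated in the usage text.
EXTENSION_TO_FRAGMENT = {".bas": "BASIC", ".asm": "CODE", ".hex": "BYTE"}


def default_fragment(srcfile: str) -> str:
    """Return the fragment mode implied by the source file extension."""
    ext = os.path.splitext(srcfile)[1]
    # Assumption: any other extension falls back to BASIC, the documented default.
    return EXTENSION_TO_FRAGMENT.get(ext, "BASIC")


if __name__ == "__main__":
    for name in ("beep.asm", "game.bas", "dump.hex"):
        print(name, "->", default_fragment(name))
```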
The following mnemonics are understood by lhasm:
These mnemonics are aliases, provided as standalone instructions to help in coding.
LDW := SBR (&C0)
LJNE k,d := SBR (&C2)
JNE k,d := SBR (&C4)
BKW := SBR (&C6)
LJNES d := SBR (&C8)
STS (n) := SBR (&CA)
LDS (n) := SBR (&CC)
VAR n,d := SBR (&CE)
INTG n,d := SBR (&D0)
ARG d,n := SBR (&D2)
STB n := SBR (&D4)
LDB n := SBR (&D6)
IFC := SBR (&D8)
STVP := SBR (&DA)
LDPT := SBR (&DC)
EVAL d := SBR (&DE)
ERRH := SBR (&E0)
RST := SBR (&E2)
ERR1 := SBR (&E4)
LYX := SBR (&E6)
NORM := SBR (&E8)
SLX := SBR (&EA)
CLX := SBR (&EC)
ADN := SBR (&EE)
SXY := SBR (&F0)
CLS := SBR (&F2)
LDU (mn) := SBR (&F4)
STU (mn) := SBR (&F6)
These instructions are aliases and are provided for backward compatibility:
OUTA := ATP
RSET := OFF
STA T0 := AM0
STA T1 := AM1
STI := LDI
SWA := AEX
SWP := AEX
SLD := RLD
SRD := RRD
Convention for the mnemonics described above:
n := Byte 8-bits value, within 0..255 (&FF)
mn := Word 16-bits value, within 0..65535 (&FFFF)
(n) := Indirect 8-bits value, within 0..255 (&FF)
(mn) := Indirect 16-bits value, within 0..65535 (&FFFF)
cc := Condition: C, NC, V, NV, Z, NZ, V, NV, ==, !=, <, >=
d := 8-bits displacement, within 0..255
rh := High 8-bits register: B, D, H, M
rl := Low 8-bits register: C, E, L, N
R := Whole 16-bits register: BC, DE, HL, MN
(R) := Indirect whole 16-bits register: (BC), (DE), (HL), (MN)
A := Accumulator
F := Flags (status)
PC := Program counter
SP := Stack pointer
T0 := Timer with 9th-bit to 0
T1 := Timer with 9th-bit to 1
[#] := Optional second page access
k := BASIC keyword code if k >= &E000, else an 8-bit value is assumed
1.2/ Base and character specifiers
When specifying an immediate value, the following specifiers are understood:
n := Hexadecimal 8-bits value (2-digits) within 00..FF
&n := Hexadecimal 8-bits value (1 to 2-digits) within &00..&FF
#n := Decimal 8-bits value (1 to 3-digits) within #0..#255
@n := Octal 8-bits value (1 to 3-digits) within @0..@377
\xn := Hexadecimal 8-bits value (1 to 2-digits) within &00..&FF
\un := Decimal 8-bits value (1 to 3-digits) within #0..#255
\on := Octal 8-bits value (1 to 3-digits) within @0..@377
$c := Character ASCII code of c
mn := Hexadecimal 16-bits value (4-digits) within 0000..FFFF
&mn := Hexadecimal 16-bits value (1 to 4-digits) within &0000..&FFFF
#mn := Decimal 16-bits value (1 to 5-digits) within #0..#65535
@mn := Octal 16-bits value (1 to 6-digits) within @0..@177777
\Xmn := Hexadecimal 16-bits value (1 to 4-digits) within &0000..&FFFF
\Umn := Decimal 16-bits value (1 to 5-digits) within #0..#65535
\Omn := Octal 16-bits value (1 to 6-digits) within @0..@177777
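To make the specifier table concrete, here is a small Python sketch that converts one token into an integer. It is not lhasm's parser: `SPECIFIERS` and `parse_immediate` are hypothetical names, digit counts are not checked, and the 8-bit limit for the lowercase escapes is read off the ranges in the table.

```python
# Prefix -> (numeric base, largest allowed value). Bare digits are hexadecimal.
SPECIFIERS = {
    "\\x": (16, 0xFF), "\\u": (10, 0xFF), "\\o": (8, 0xFF),
    "\\X": (16, 0xFFFF), "\\U": (10, 0xFFFF), "\\O": (8, 0xFFFF),
    "&": (16, 0xFFFF), "#": (10, 0xFFFF), "@": (8, 0xFFFF),
}


def parse_immediate(token: str) -> int:
    """Convert one immediate token to an integer, following the table."""
    if token.startswith("$"):
        # $c: ASCII code of the single character c.
        if len(token) != 2:
            raise ValueError(f"expected one character after $: {token!r}")
        return ord(token[1])
    for prefix, (base, limit) in SPECIFIERS.items():
        if token.startswith(prefix):
            digits = token[len(prefix):]
            break
    else:
        # No specifier: the value is read as hexadecimal.
        digits, base, limit = token, 16, 0xFFFF
    value = int(digits, base)
    if not 0 <= value <= limit:
        raise ValueError(f"value out of range: {token!r}")
    return value


if __name__ == "__main__":
    for tok in ("40C5", "&FF", "#255", "@377", r"\u15", "$A"):
        print(tok, "->", parse_immediate(tok))
```

With this reading, `\u15` yields 15 (&0F), which agrees with the byte 0F assembled for `\u15` in the XFER listing of section 1.5.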
1.3/ Unary operators
When specifying an immediate value, i.e., d, n, or mn, unary operators may be added as follows:
+n := Positive displacement, PC+d
-n := Negative displacement, PC-d
*mn := Offset between current and mn
<mn := High 8-bits from the mn value, ie, m
>mn := Low 8-bits from the mn value, ie, n
!mn := &FFFF XOR'ed 16-bits or 8-bits value
^n := Value with bit n set to 1 for n in 1..16 (so ^08 is &80), or 0 if n = 0
'mn := First bit set to 1 in mn, starting from the left
/mn := First bit set to 1 in mn, starting from the right
~mn := Swap byte m and n for 16-bits value
~n := Swap digits nH and nL for 8-bits value
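The value operators can be sketched in Python as plain bit manipulation. This is an illustration under stated assumptions, not lhasm's code: the function names are hypothetical, `>mn` is taken to yield the low byte n (as its "ie, n" gloss and the LDR macro comments in section 1.5 indicate), `^n` is taken to count bits from 1 so that `^08` gives &80 as in the listings, and the `'` and `/` operators are left out because their exact result is not spelled out here.

```python
def high(mn: int) -> int:
    """<mn: high 8 bits of a 16-bit value."""
    return (mn >> 8) & 0xFF


def low(mn: int) -> int:
    """>mn: low 8 bits of a 16-bit value."""
    return mn & 0xFF


def invert(value: int, bits: int = 16) -> int:
    """!mn: XOR with an all-ones mask of the given width (16 or 8 bits)."""
    return value ^ ((1 << bits) - 1)


def bit(n: int) -> int:
    """^n: value with bit n set, counting from 1; 0 when n is 0."""
    if not 0 <= n <= 16:
        raise ValueError("n must be within 0..16")
    return 0 if n == 0 else 1 << (n - 1)


def swap_bytes(mn: int) -> int:
    """~mn: exchange the two bytes of a 16-bit value."""
    return ((mn & 0xFF) << 8) | ((mn >> 8) & 0xFF)


def swap_digits(n: int) -> int:
    """~n: exchange the two hexadecimal digits of an 8-bit value."""
    return ((n & 0x0F) << 4) | ((n >> 4) & 0x0F)


if __name__ == "__main__":
    print(f"{high(0x8899):02X} {low(0x8899):02X} {bit(8):02X}")
```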
When specifying a register, unary operators may be added as follows:
#<rR := High 8-bits register from rR ie, B, D, H, M
#>rR := Low 8-bits register from rR, ie, C, E, L, N
#^rR := Whole 16-bits register from rR, ie, BC, DE, HL, MN
#*rR := Indirect 16-bits register from rR, ie, (BC), (DE), (HL), (MN)
rR is any register: rh, rl, R, (R)
When specifying a condition, a unary operator may be added as follows:
#!cc := Return the inverse condition as shown below:
#!cc := Ncc
#!Ncc := cc
#!!= := ==
#!== := !=
#!>= := <
#!< := >=
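The inversion rules above amount to a symmetric lookup table. A minimal Python sketch follows; the names `INVERSE` and `invert_condition` are hypothetical, and the table covers only the flag conditions C, V, Z and the comparison forms listed in this document.

```python
# Each pair is a condition and its inverse, as listed in the rules.
_PAIRS = [("C", "NC"), ("V", "NV"), ("Z", "NZ"), ("==", "!="), ("<", ">=")]

INVERSE = {}
for cond, inv in _PAIRS:
    INVERSE[cond] = inv
    INVERSE[inv] = cond


def invert_condition(cc: str) -> str:
    """Return the inverse of a condition, e.g. 'C' -> 'NC' or '>=' -> '<'."""
    return INVERSE[cc]


if __name__ == "__main__":
    for cc in ("C", "NZ", "==", ">="):
        print(cc, "->", invert_condition(cc))
```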
1.4/ Symbols and variables
A symbol is global to the file and may be used at any time. The symbols defined
in a source file may be saved into a .sym file by using the -S 'symfile' option.
A symbol is declared by writing its name followed by : (colon). Note
that no instruction is allowed after a symbol declaration.
The current PC value is then assigned to the symbol.
To assign a specific value to a symbol, use .EQU 'value'.
The name of a symbol should not start with . \ or % because these characters are reserved for special usage.
DOBEEP1 .EQU e669
Running lhasm will give:
2 .CODE 40C5
3 40C5 DOBEEP1: .EQU E669
4 40C5 B5 10 LDA 10
5 40C7 LOOP: .EQU 40C7
6 40C7 FD C8 PUSH A
7 40C9 BE E6 69 CALL DOBEEP1
8 40CC FD 8A POP A
9 40CE DF DEC A
10 40CF 99 0A JR NZ LOOP
11 40D1 9A RET
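The operand byte of the relative jump in this listing can be cross-checked by hand. The Python sketch below assumes, based only on the bytes shown, that the displacement is counted from the address of the instruction following the jump and that the jump occupies 2 bytes; `relative_operand` is a hypothetical helper, not part of lhasm.

```python
def relative_operand(instr_addr: int, target: int, instr_len: int = 2):
    """Return (direction, displacement) for a relative jump.

    Assumption: the displacement is measured from the address of the
    next instruction, i.e. instr_addr + instr_len.
    """
    next_pc = instr_addr + instr_len
    delta = target - next_pc
    direction = "+" if delta >= 0 else "-"
    displacement = abs(delta)
    if displacement > 0xFF:
        raise ValueError("target is out of reach for an 8-bit displacement")
    return direction, displacement


if __name__ == "__main__":
    # JR NZ LOOP at &40CF with LOOP at &40C7: the listing shows operand 0A.
    direction, d = relative_operand(0x40CF, 0x40C7)
    print(direction, f"{d:02X}")
```

For `JR NZ LOOP` at &40CF this gives a backward displacement of &0A, which matches the second byte of `99 0A` in the listing.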
A variable is like a symbol, but it is local to a source file and is not saved by the -S
option. A variable name starts with % and has the form %nnc, where nn
is a 2-digit number from 00 to 99 and c is a lowercase character from a
to z. A maximum of 2600 variables may be declared.
%10a .EQU 10
Running lhasm will give:
2 .CODE 40C5
3 40C5 %10a .EQU 0010
4 40C5 6A 10 LD L %10a
5 40C7 %01l .EQU 40C7
6 40C7 49 00 AND (BC) 00
7 40C9 44 INC BC
8 40CA 88 05 DJC %01l
9 40CC 9A RET
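The %nnc naming rule and the limit of 2600 variables (100 numbers times 26 letters) can be checked with a short Python sketch. The names `VARIABLE_NAME`, `is_variable_name` and `all_variable_names` are hypothetical and only illustrate the rule as stated in this section.

```python
import re
import string

# % followed by exactly two digits and one lowercase letter.
VARIABLE_NAME = re.compile(r"%[0-9]{2}[a-z]")


def is_variable_name(name: str) -> bool:
    """True if name has the %nnc form described for variables."""
    return VARIABLE_NAME.fullmatch(name) is not None


def all_variable_names():
    """Enumerate every possible variable name, %00a through %99z."""
    return [f"%{n:02d}{c}" for n in range(100) for c in string.ascii_lowercase]


if __name__ == "__main__":
    print(is_variable_name("%10a"), is_variable_name("%1a"))
    print(len(all_variable_names()))
```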
1.5/ Using macros
A macro is a block of code that is expanded each time its name is found in the source.
Imagine we want an instruction such as JR >, which does not exist.
Just create a macro called JR>; when the assembler finds
JR> label it expands this code. The parameter label is
passed to the code and substituted according to the macro rules.
A macro is defined by the directive .MACRO: 'name',
followed by any code, with optional substitution markers, and is terminated by .ENDMACRO.
The substitution markers have the form __#n where n is within 0..9.
When the macro is found in the code, the first parameter after the
name is __#0, the second __#1, and so on, up to the 10th and last, __#9.
An example with the macro JR>
JR ==,+02 ; If Z values are equal, test is false
JR >=,__#0 ; If C, the test is true
And the following source:
RET ; test false
gt: ; Greater than
196 01FF B5 10 LDA 10
197 0201 48 09 LD B 09
198 JR> gt
198 0203 8B 02 JR == +02 ; If Z values are equal test is false
198 0205 83 01 JR >= gt ; If C the test is true
199 0207 9A RET ; test false
200 0208 gt: .EQU 0208
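Parameter substitution can be pictured as a textual replacement of each __#n marker by the matching argument. The Python sketch below is an approximation for illustration only; `expand_macro` is a hypothetical helper, and lhasm may perform the substitution at a different stage of assembly.

```python
import re

# A marker is __# followed by a single digit 0..9.
MARKER = re.compile(r"__#([0-9])")


def expand_macro(body_lines, args):
    """Replace every __#n marker in the macro body with args[n]."""
    if len(args) > 10:
        raise ValueError("a macro takes at most 10 parameters")

    def substitute(match):
        index = int(match.group(1))
        if index >= len(args):
            raise ValueError(f"missing parameter for __#{index}")
        return args[index]

    return [MARKER.sub(substitute, line) for line in body_lines]


if __name__ == "__main__":
    jr_gt = ["JR ==,+02", "JR >=,__#0"]
    for line in expand_macro(jr_gt, ["gt"]):
        print(line)
```

Applied to the two body lines of JR> with the argument `gt`, the sketch yields `JR ==,+02` and `JR >=,gt`, the same instructions that appear at &0203 and &0205 in the listing.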
By mixing the unary operators and substitution markers, some powerful macros may be defined:
LD #<__#0,<__#1 ; rH register loaded with high 8-bits
LD #>__#0,>__#1 ; rL register loaded with low 8-bits
The macro LDR is now defined and expands to code that loads a 16-bits value into a whole 16-bits register.
6 LDR HL 8899
6 40C5 68 88 LD #<__#0 <__#1 ; rH register loaded with high 8-bits
6 40C7 6A 99 LD #>__#0 >__#1 ; rL register loaded with low 8-bits
When developing complex macros, it is also necessary to have some
labels for jumps or addresses inside the macro. Because macros
are reentrant, these labels should be visible only inside the macro. To
do this, the 10 labels 0: 1: .. 9: are available inside a macro. Note
that the label x: should NOT be followed by an instruction.
The macro XFER copies bytes from (BC), incrementing BC, to (DE), decrementing DE, until L reaches
&FF, but stops if bit 7 (&80 := ^08) is set.
LD B <__#0
LD C >__#0
LD D <__#1
LD E >__#1
LD L >__#2
LDI (BC) ; Load A with (BC) and increment BC
STD (DE) ; Store A to (DE) and decrement DE
BIT ^08 ; Bit 7 of A is set
JR C,2: ; Yes ! XFER is finished
DJC 1: ; Decrement L and jump to 1: if not C
And to transfer the BASIC A$ variable to &47FF, do
XFER A$ 47FF \u15
21 XFER A$ 47FF \u15
21 40C5 48 78 LD B <__#0
21 40C7 4A C0 LD C >__#0
21 40C9 58 47 LD D <__#1
21 40CB 5A FF LD E >__#1
21 40CD 6A 0F LD L >__#2
21 40CF 1:
; #XFER__00__#1 .EQU 40CF
21 40CF 45 LDI (BC) ; Load A with (BC) and increment BC
21 40D0 53 STD (DE) ; Store A to (DE) and decrement DE
21 40D1 BF 80 BIT ^08 ; Bit 7 of A is set
21 40D3 93 02 JR C 2: ; Yes ! XFER is finished
21 40D5 88 08 DJC 1: ; Decrement L and jump to 1: if not C
21 40D7 2:
; #XFER__00__#2 .EQU 40D7
21 40D7 9A RET
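The comment lines in this listing (`#XFER__00__#1`, `#XFER__00__#2`) suggest that each local label is turned into a uniquely named symbol per expansion. The Python sketch below reproduces that naming pattern; it is a guess from the listing alone, `local_label_symbol` is a hypothetical helper, and reading the middle field `00` as a per-expansion counter is an assumption.

```python
def local_label_symbol(macro_name: str, expansion_index: int, label_digit: int) -> str:
    """Build a symbol name shaped like the ones shown in the listing.

    Assumption: the two-digit middle field counts expansions of the macro,
    so that a second XFER in the same source would get __01__.
    """
    if not 0 <= label_digit <= 9:
        raise ValueError("local labels are 0: through 9:")
    if not 0 <= expansion_index <= 99:
        raise ValueError("expansion index must fit in two digits")
    return f"#{macro_name}__{expansion_index:02d}__#{label_digit}"


if __name__ == "__main__":
    print(local_label_symbol("XFER", 0, 1))
    print(local_label_symbol("XFER", 0, 2))
```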
1.6/ Assembly inlining in BASIC program
To simplify the introduction of assembly code inside BASIC instructions like REM,
POKE and DATA, or when assigning a $ variable, it is now possible to call the assembler
while a BASIC fragment is active.
The syntax is the following:
'basic line num' ...'inst':...'inst' \asm[
assembly code, with symbols, variables and macros
Note that \asm[ must be at the end of a source line, and \]end must be at the beginning of
a source line and followed by a space.
A small example below:
10 REM \asm[
%80h .EQU ^08
20 POKE A, \asm[
str: .EQU .
\$A \$B \$C
40 DATA \asm[
Running lhasm on this source will give:
2 .MACRO: LDBC_nn
3 ; LDBC_nn: 1 LD B,<__#0
4 ; LDBC_nn: 2 LD C,>__#0
5 .ENDMACRO ; LDBC_nn
7 40CA 10 REM \asm[
8 40CA %80h .EQU 0080
9 40CC LDA 00
10 LDBC_nn 7750
10 40CE LD B <__#0
10 40D0 LD C >__#0
11 40D2 LD L %80h
12 40D2 loop: .EQU 40D2
13 40D3 STI (BC)
14 40D5 DJC loop
15 40D6 RET
16 40C5 00 0A 0F F1 AB \]end
B5 00 48 77 4A
50 6A 80 41 88
03 9A 0D
17 40DE 20 POKE A, \asm[
18 40E5 SBR (F2)
19 40F1 CALL &ED00
20 40F5 RET
21 40D7 00 14 1C F1 A1 \]end
41 2C 26 43 44
2C 26 46 32 2C
26 42 45 2C 26
45 44 2C 26 30
30 2C 26 39 41
22 40FD 30 E$="\asm[
23 LDBC_nn str
23 40FF LD B <__#0
23 4101 LD C >__#0
24 4102 RET
25 4102 str: .EQU 4102
26 4105 \$A \$B \$C
27 40F6 00 1E 12 45 24 \]end EFGH"
3D 22 48 41 4A
02 9A 41 42 43
45 46 47 48 22
28 4110 40 DATA \asm[
29 4117 PUSH HL
30 411F PUSH BC
31 412B CALL BEEP1
32 4133 POP BC
33 413B POP HL
34 413F RET
35 410B 00 28 32 F1 8D \]end
26 46 44 2C 26
41 38 2C 26 46
44 2C 26 38 38
2C 26 42 45 2C
26 45 36 2C 26
36 39 2C 26 46
44 2C 26 30 41
2C 26 46 44 2C
26 32 41 2C 26
39 41 0D
36 4140 00 32 03 F1 8E 50 END
4146 FF [END BASIC MARKER]
and the following BASIC program:
10 REM \B5\00HwJPj\80A\88\03\9A
20 POKE A,&CD,&F2,&BE,&ED,&00,&9A
40 DATA &FD,&A8,&FD,&88,&BE,&E6,&69,&FD,&0A,&FD,&2A,&9A
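The way assembled bytes appear in the resulting BASIC lines can be reproduced with a short Python sketch. It is inferred from the output shown here, not taken from lhasm: `as_value_list` and `as_rem_text` are hypothetical names, and the rule that printable ASCII (other than the backslash) is kept as a character while every other byte becomes \XX is an assumption that happens to match line 10.

```python
def as_value_list(code: bytes) -> str:
    """Format bytes as the &XX,&XX,... list seen after POKE and DATA."""
    return ",".join(f"&{b:02X}" for b in code)


def as_rem_text(code: bytes) -> str:
    """Format bytes as REM text: printable ASCII as-is, others as \\XX."""
    out = []
    for b in code:
        # Assumption: 0x20..0x7E is shown literally, except the backslash.
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            out.append(f"\\{b:02X}")
    return "".join(out)


if __name__ == "__main__":
    print("20 POKE A," + as_value_list(bytes.fromhex("CDF2BEED009A")))
    print("10 REM " + as_rem_text(bytes.fromhex("B50048774A506A804188039A")))
```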
1.7/ Assembler directives
.ORIGIN: 'base addr'
Set 'base addr' as new origin address.
End the assembler and update pointers for saving the binary file. If -ns is
not specified and -T or -L logfile is given, the symbols and variables
defined are listed after a .SYMBOLS: banner. If -ns is set, the symbols
and variables are not listed.
Set a comment to the current fragment.
Enter into BASIC fragment. BASIC lines are compiled. A BASIC line
starts with a number 1..65529 followed by a BASIC keyword or expression.
Enter into CODE fragment. LH5801 mnemonics are assembled.
Enter into BYTE fragment. Bytes 8-bits values are compiled.
Enter into WORD fragment. Words 16-bits values are compiled.
Enter into LONG fragment. Longs 32-bits values are compiled.
Enter into TEXT fragment. Text between " are compiled.
Enter into HOLE fragment. Obscure area. Only .SKIP 'nn' is expected
to skip bytes.
Enter KEYWORD fragment. The BASIC keyword table is built. The word
pointers area is updated. Note that .KEYWORD is expected to be specified
on a 2048-byte boundary + &54, i.e., &0054, &0854, &1054, etc.
.DEFINE: "'keyword'" = 'code'
Define 'keyword' with the 'code' as a new BASIC keyword. The entry point
is fixed to the current PC address.
Perform a checksum computation and write checksum value as a 16-bits
word at the current address. The checksum is computed from the first
.ORIGIN: and up to the current address.
If ('code') is given, the checksum is stored after 'code' is written.
If + is given before ('code'), the 'code' is added to the checksum.
Define a new macro 'name'. All code given is assumed to be part of the macro until .ENDMACRO is encountered.
End the current macro.
Include the file 'file'. If the file is already included nothing is done.
Dummy directives handled for backward compatibility with lhdump.
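The placement rule given for the KEYWORD fragment (a multiple of 2048 bytes plus &54) is easy to verify before assembling. The Python sketch below is only an illustration; `is_valid_keyword_origin` and `keyword_origins` are hypothetical helpers, and the 64 KB address space used as the default limit is an assumption.

```python
KEYWORD_TABLE_OFFSET = 0x54   # offset inside each block, from the rule for .KEYWORD
BLOCK_SIZE = 2048             # block size in bytes, from the rule for .KEYWORD


def is_valid_keyword_origin(addr: int) -> bool:
    """True if addr lies at a 2048-byte multiple plus &54."""
    return addr % BLOCK_SIZE == KEYWORD_TABLE_OFFSET


def keyword_origins(limit: int = 0x10000):
    """List every valid origin below limit (assumed 64 KB address space)."""
    return list(range(KEYWORD_TABLE_OFFSET, limit, BLOCK_SIZE))


if __name__ == "__main__":
    print([f"&{a:04X}" for a in keyword_origins()[:3]])
    print(is_valid_keyword_origin(0x0800))
```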
1.8/ Immediate assembler
The standard assembler makes two passes. It is also possible to generate code
immediately by calling the immediate assembler, i.e., one pass only, with the option
-i. In this case, the source code is read from stdin and, if the trace mode is
enabled (option -T, output to stderr), the immediate code and information are printed.
To exit from the immediate assembler, use CTRL+D. Exiting with CTRL+C will not write a
binary file, and the generation of the symbol, fragment and macro files may be
disturbed by CTRL+C.
If no -o 'binfile' option is given, stdin.bin is used as the output binary file.
Note that when running the immediate assembler, variables and symbols must be
defined with the correct value BEFORE they are used; otherwise an error may be generated
due to a bad or undefined value. Macro definition and expansion remain usable with the
immediate assembler.
./lhasm -T -i -O 40C5 -c
1 40C5 .CODE
2 40C5 34 CLA
3 40C6 48 79 LD B 79
4 40C8 4A 00 LD C 00
5 40CA loop: .EQU 40CA
6 40CA F5 STI
7 40CB 4E C0 CP C C0
JR NC loop
8 40CD 91 05 JR NC loop
9 40CF 9A RET