jcc.h
Functions
cc_add_breakpoint

Add a breakpoint at a specific program counter address.

    int cc_add_breakpoint(JCC *vm, long long *pc);

Return Value
Breakpoint index, or -1 if failed (too many breakpoints).

cc_add_watchpoint

Add a watchpoint at a specific memory address.

    int cc_add_watchpoint(JCC *vm, void *address, int size, int type, const char *expr);

Return Value
Watchpoint index, or -1 if failed (too many watchpoints).

cc_compile

Compile the parsed program (Obj list) into VM bytecode.

    void cc_compile(JCC *vm, Obj *prog);

cc_debug_repl

Enter the interactive debugger REPL (Read-Eval-Print Loop).

    void cc_debug_repl(JCC *vm);

Discussion
Provides an interactive shell for debugging with commands like:
- break/b: Set breakpoint
- continue/c: Continue execution
- step/s: Single step
- next/n: Step over
- finish/f: Step out
- print/p: Print registers
- stack/st: Print stack
- help/h: Show help

cc_define

Define or override a preprocessor macro for the given VM.

cc_destroy

Free resources owned by a JCC instance.

    void cc_destroy(JCC *vm);

Discussion
Does not free the `JCC` struct itself; the caller is responsible for the memory of the struct if it was dynamically allocated.

cc_dlopen(JCC *, const char *)

Load a dynamic library and resolve all registered FFI functions.

Return Value
0 on success, -1 on error.

Discussion
This function opens a dynamic library and attempts to resolve all currently registered FFI functions. Functions that cannot be resolved print a warning but do not fail the entire operation. If lib_path is NULL, the function searches the default system libraries.

Platform-specific behavior:
- Unix: Uses dlopen/dlsym to load .so/.dylib files
- Windows: Uses LoadLibrary/GetProcAddress to load .dll files

The library handle is not closed after loading, so that resolved function pointers stay valid.

cc_dlopen(JCC *, const char *)

Open a dynamic library.

Return Value
0 on success, -1 on failure.
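The "warn but do not fail" resolution behavior described for cc_dlopen can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoFunc, demo_lookup_fn, demo_resolve_all, and demo_lookup are hypothetical names that do not appear in jcc.h, and the toy resolver stands in for dlsym/GetProcAddress.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a registered FFI function. */
typedef struct {
    const char *name;
    void *func_ptr;
} DemoFunc;

/* Hypothetical resolver type; stands in for dlsym()/GetProcAddress(). */
typedef void *(*demo_lookup_fn)(const char *name);

/* Try to resolve every registered function. A missing symbol only
 * produces a warning. Returns the number of unresolved functions. */
int demo_resolve_all(DemoFunc *table, int count, demo_lookup_fn lookup) {
    assert(table != NULL || count == 0);
    int unresolved = 0;
    for (int i = 0; i < count; i++) {
        void *sym = lookup(table[i].name);
        if (sym) {
            table[i].func_ptr = sym;
        } else {
            fprintf(stderr, "warning: could not resolve '%s'\n", table[i].name);
            unresolved++;
        }
    }
    return unresolved;
}

/* Toy resolver that knows exactly one symbol, named "known". */
int demo_known_symbol;
void *demo_lookup(const char *name) {
    return strcmp(name, "known") == 0 ? (void *)&demo_known_symbol : NULL;
}
```

Calling demo_resolve_all on a table that holds "known" and "missing" fills in the first pointer, warns about the second, and still returns normally.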
cc_dlsym(JCC *, const char *, void *, int, int)

Update an existing registered FFI function's pointer by name.

Return Value
0 on success, -1 on error (function not found or signature mismatch).

Discussion
This function is useful for updating function pointers after loading a dynamic library, or for redirecting calls to different implementations. The function must already be registered via cc_register_cfunc or cc_register_variadic_cfunc.

cc_dlsym(JCC *, const char *, void *, int, int)

Resolve a symbol in a dynamic library.

Return Value
0 on success, -1 on failure.

cc_find_function_entry

Find the program counter address for a function entry point by name.

    long long *cc_find_function_entry(JCC *vm, const char *name);

Return Value
Program counter address, or NULL if not found.

cc_find_pc_for_source

Find the program counter address for a given source location.

    long long *cc_find_pc_for_source(JCC *vm, File *file, int line);

Return Value
Program counter address, or NULL if not found.

cc_get_source_location

Get the source file location for a given program counter.

    int cc_get_source_location(JCC *vm, long long *pc, File **out_file, int *out_line);

Return Value
1 if location found, 0 if not found.

cc_include

Add a directory to the compiler's header search paths.

    void cc_include(JCC *vm, const char *path);

Discussion
This adds the path to the list of directories searched for "..." includes (quote includes).

cc_init

Initialize a JCC instance.

Discussion
The caller should allocate a `JCC` struct (usually on the stack) and pass its pointer to this function. This sets up memory segments, default include paths, and other runtime defaults.

cc_link_progs

Link multiple parsed programs (Obj lists) into a single program.

    Obj *cc_link_progs(JCC *vm, Obj **progs, int count);

Return Value
A single merged Obj* linked list containing all objects.

Discussion
Takes an array of Obj* programs and combines them into one linked list.
This allows multiple source files to be compiled together into a single program. The function handles duplicate definitions by preferring definitions over declarations.

cc_load_bytecode

Load compiled bytecode from a file.

    int cc_load_bytecode(JCC *vm, const char *path);

Return Value
0 on success, -1 on error.

Discussion
Deserializes bytecode previously saved with cc_save_bytecode() and prepares the VM for execution with cc_run().

cc_load_libc

Load the platform's standard C library and resolve FFI functions.

    int cc_load_libc(JCC *vm);

Return Value
0 on success, -1 on error.

Discussion
This function automatically detects and loads the correct C library for the current platform:
- macOS: /usr/lib/libSystem.dylib
- Linux: /lib64/libc.so.6 (or /lib/libc.so.6 on 32-bit)
- FreeBSD: /lib/libc.so.7
- Windows: msvcrt.dll

This is useful when you want to load stdlib functions dynamically instead of registering them with explicit function pointers.

cc_load_stdlib

Register all standard library functions available via FFI.

    void cc_load_stdlib(JCC *vm);

Discussion
Automatically registers 50+ standard library functions including:
- Memory: malloc, free, calloc, realloc, memcpy, memmove, memset, memcmp
- String: strlen, strcpy, strncpy, strcat, strcmp, strncmp, strchr, strstr
- I/O: puts, putchar, getchar, fopen, fclose, fread, fwrite, fgetc, fputc
- Math: sin, cos, tan, sqrt, pow, exp, log, floor, ceil, fabs
- Conversion: atoi, atol, atof, strtol, strtod
- System: exit, abort, system, open, close, read, write

This function is automatically called by cc_init(), but can be called manually if you want to reset the FFI registry or initialize it separately.

cc_lookup_symbol

Look up a debug symbol by name in the current scope.

    DebugSymbol *cc_lookup_symbol(JCC *vm, const char *name);

Return Value
Pointer to DebugSymbol if found, NULL otherwise.

cc_output_json

Output C header declarations as JSON for FFI wrapper generation.
    void cc_output_json(FILE *f, Obj *prog);

Discussion
Serializes function signatures, struct/union definitions, enum declarations, and global variables from the parsed AST to JSON format. The output includes full type information with recursive expansion of pointers, arrays, and aggregate types. Storage class specifiers (static, extern) are included. Function bodies are not serialized - only signatures.

The JSON output format is:

    { "functions": [...], "structs": [...], "unions": [...], "enums": [...], "variables": [...] }

cc_parse

Parse a preprocessed token stream into an AST and produce a linked list of top-level Obj declarations.

Return Value
Linked list of top-level Obj representing globals and functions.

cc_parse_assign

Parse an assignment expression from a token stream (stops at commas).

    Node *cc_parse_assign(JCC *vm, Token **rest, Token *tok);

Return Value
AST node representing the parsed assignment expression.

Discussion
Used for parsing function arguments and other contexts where commas are separators rather than operators.

cc_parse_compound_stmt

Parse a compound statement (block) from a token stream.

    Node *cc_parse_compound_stmt(JCC *vm, Token **rest, Token *tok);

Return Value
AST node representing the parsed compound statement.

cc_parse_expr

Parse a single C expression from a token stream.

    Node *cc_parse_expr(JCC *vm, Token **rest, Token *tok);

Return Value
AST node representing the parsed expression.

cc_parse_stmt

Parse a single C statement from a token stream.

    Node *cc_parse_stmt(JCC *vm, Token **rest, Token *tok);

Return Value
AST node representing the parsed statement.

cc_preprocess

Run the preprocessor on a C source file and return a token stream.

    Token *cc_preprocess(JCC *vm, const char *path);

Return Value
Head of the token stream (linked Token list). Caller owns tokens.

cc_print_stack_report

Print stack instrumentation statistics and report.
    void cc_print_stack_report(JCC *vm);

Discussion
Outputs stack usage statistics including high water mark, variable access counts, and scope information. Only useful when stack instrumentation is enabled.

cc_print_tokens

Print a token stream to stdout (useful for debugging the preprocessor and tokenizer).

    void cc_print_tokens(Token *tok);

cc_register_cfunc

Register a native C function to be callable from VM code via FFI.

    void cc_register_cfunc(JCC *vm, const char *name, void *func_ptr, int num_args, int returns_double);

Discussion
Registered functions can be called from C code compiled to VM bytecode. The CALLF instruction handles argument marshalling. All integer types are passed/returned as long long, floats as double.

cc_register_variadic_cfunc

Register a variadic native C function to be callable from VM code via FFI.

    void cc_register_variadic_cfunc(JCC *vm, const char *name, void *func_ptr, int num_fixed_args, int returns_double);

Discussion
This function is only available when JCC_HAS_FFI is defined. When libffi is not available, use fixed-argument wrappers instead. Variadic functions accept a variable number of arguments after the fixed arguments. Example: printf has 1 fixed arg (format), fprintf has 2 (stream, format).

cc_remove_breakpoint

Remove a breakpoint by index.

    void cc_remove_breakpoint(JCC *vm, int index);

cc_remove_watchpoint

Remove a watchpoint by index.

    void cc_remove_watchpoint(JCC *vm, int index);

cc_run

Execute the compiled program within the VM.

Return Value
Program exit code returned by main().

cc_save_bytecode

Save compiled bytecode to a file for later execution.

    int cc_save_bytecode(JCC *vm, const char *path);

Return Value
0 on success, -1 on error.

Discussion
Serializes the text segment, data segment, and necessary metadata to a binary file. The file can be loaded and executed by cc_load_bytecode().
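The calling convention stated for cc_register_cfunc above (integer arguments and results travel as long long) can be illustrated with a small dispatcher. This is a sketch under assumptions, not the VM's CALLF implementation: DemoCFunc, demo_fn, demo_add, and demo_call_int are hypothetical names, the sketch stores a generic function pointer instead of the documented `void *`, and it only covers integer-returning functions with up to two arguments.

```c
#include <assert.h>

/* Generic function pointer type used for storage in this sketch. */
typedef void (*demo_fn)(void);

/* Hypothetical registry entry mirroring the documented arguments. */
typedef struct {
    const char *name;
    demo_fn func_ptr;
    int num_args;
    int returns_double;
} DemoCFunc;

/* A native function written against the convention:
 * integer arguments and the result are all long long. */
long long demo_add(long long a, long long b) { return a + b; }

/* Call an integer-returning function with 0, 1, or 2 arguments by
 * casting the stored pointer back to its real signature. */
long long demo_call_int(const DemoCFunc *f, const long long *args) {
    assert(!f->returns_double && f->num_args >= 0 && f->num_args <= 2);
    switch (f->num_args) {
    case 0:
        return ((long long (*)(void))f->func_ptr)();
    case 1:
        return ((long long (*)(long long))f->func_ptr)(args[0]);
    default:
        return ((long long (*)(long long, long long))f->func_ptr)(args[0], args[1]);
    }
}
```

A function that takes narrower integer types would need a thin wrapper with long long parameters before it fits this scheme.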
cc_set_asm_callback

Register a callback invoked for `asm("...")` statements.

    void cc_set_asm_callback(JCC *vm, JCCAsmCallback callback, void *user_data);

cc_system_include

Add a directory to the compiler's system header search paths.

    void cc_system_include(JCC *vm, const char *path);

Discussion
This adds the path to the list of directories searched for <...> includes (angle bracket includes). System include paths are searched after regular include paths for "..." includes.

cc_undef

Remove a preprocessor macro definition from the VM.

Typedefs
AllocHeader

Metadata header stored before each heap allocation for tracking.

    typedef struct AllocHeader {
        size_t size;            // Allocated size (rounded, excluding header)
        size_t requested_size;  // Original requested size (for bounds checking)
        int magic;              // Magic number for debugging (0xDEADBEEF)
        long long canary;       // Front canary (if heap canaries enabled)
        int freed;              // 1 if freed (for UAF detection)
        int generation;         // Generation counter (incremented on free)
        long long alloc_pc;     // PC at allocation site (for leak detection)
        int type_kind;          // Type of allocation (TypeKind enum, for type checking)
    } AllocHeader;
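How a header placed before each allocation supports bounds and use-after-free checks can be sketched as below. This is a simplified illustration, not the VM's allocator: DemoHeader keeps only a few of the documented fields, stores the magic number as unsigned, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAGIC 0xDEADBEEFu

/* Hypothetical, trimmed-down header; AllocHeader has more fields. */
typedef struct {
    size_t requested_size;
    unsigned magic;
    int freed;
    int generation;
} DemoHeader;

/* Allocate a block whose header sits directly before the payload. */
void *demo_alloc(size_t n) {
    DemoHeader *h = malloc(sizeof *h + n);
    if (!h) return NULL;
    h->requested_size = n;
    h->magic = DEMO_MAGIC;
    h->freed = 0;
    h->generation = 0;
    return h + 1;
}

/* Recover the header from a payload pointer. */
DemoHeader *demo_header(void *p) { return (DemoHeader *)p - 1; }

/* Mark the block as freed but keep the memory around, so that later
 * accesses through stale pointers can still be detected. */
void demo_free(void *p) {
    DemoHeader *h = demo_header(p);
    assert(h->magic == DEMO_MAGIC);
    h->freed = 1;
    h->generation++;
}

/* Return 1 if [offset, offset + len) lies inside a live block. */
int demo_check_access(void *p, size_t offset, size_t len) {
    DemoHeader *h = demo_header(p);
    if (h->magic != DEMO_MAGIC || h->freed) return 0;
    return offset <= h->requested_size && len <= h->requested_size - offset;
}
```

Keeping requested_size separate from the rounded size, as the documented struct does, lets a check like demo_check_access reject accesses into allocator padding.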
AllocRecord

Tracks an active heap allocation for leak detection.

    typedef struct AllocRecord {
        struct AllocRecord *next;
        void *address;
        size_t size;
        long long alloc_pc;
    } AllocRecord;

CondIncl

Stack entry used to track nested #if/#elif/#else processing during preprocessing.

    typedef struct CondIncl {
        struct CondIncl *next;
        enum { IN_THEN, IN_ELIF, IN_ELSE } ctx;
        Token *tok;
        bool included;
    } CondIncl;

DebugSymbol

Represents a variable's debug information for expression evaluation.

    typedef struct DebugSymbol {
        char *name;        // Variable name
        long long offset;  // BP offset (locals) or data segment address (globals)
        Type *ty;          // Variable type
        int is_local;      // 1=local (BP-relative), 0=global (data segment)
        int scope_depth;   // Scope depth for shadowing resolution
    } DebugSymbol;

EnumConstant

Represents an enumerator constant within an enum type.

    typedef struct EnumConstant {
        char *name;
        int value;
        struct EnumConstant *next;
    } EnumConstant;

File

Represents the contents and metadata of a source file.

    typedef struct File {
        char *name;
        int file_no;
        char *contents;
        // For #line directive
        char *display_name;
        int line_delta;
    } File;

ForeignFunc

Represents a registered foreign (native C) function callable from VM code.

    typedef struct ForeignFunc {
        char *name;
        void *func_ptr;
        int num_args;
        int returns_double;
        int is_variadic;     // 1 if function is variadic (e.g., printf), 0 otherwise
        int num_fixed_args;  // For variadic functions, number of fixed args (rest are variable)
    #ifdef JCC_HAS_FFI
        ffi_cif cif;           // libffi call interface
        ffi_type **arg_types;  // Array of argument types (NULL if not prepared)
    #endif
    } ForeignFunc;
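The ForeignFunc fields above are enough to sketch the kind of update that cc_dlsym is documented to perform: replace the pointer of an already registered function, and report -1 when the name is unknown or the signature differs. The code below is an assumption-laden illustration. DemoForeignFunc and demo_update_func are hypothetical, and treating the two trailing int parameters as num_args and returns_double is a guess based on cc_register_cfunc, not something the reference states.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of ForeignFunc. */
typedef struct {
    const char *name;
    void *func_ptr;
    int num_args;
    int returns_double;
} DemoForeignFunc;

/* Replace the pointer of an already registered function.
 * Returns 0 on success, -1 if the name is not registered or the
 * recorded signature does not match the one supplied. */
int demo_update_func(DemoForeignFunc *table, int count, const char *name,
                     void *func_ptr, int num_args, int returns_double) {
    assert(name != NULL);
    for (int i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) != 0)
            continue;
        if (table[i].num_args != num_args ||
            table[i].returns_double != returns_double)
            return -1;
        table[i].func_ptr = func_ptr;
        return 0;
    }
    return -1;
}
```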
FreeBlock

Free list node for tracking freed memory blocks.

GotoPatch

Records a forward jump (JMP) that must be patched once the destination label is defined.

    typedef struct GotoPatch {
        char *name;           // Label name to jump to
        char *unique_label;   // Or unique label identifier
        long long *location;  // Location of JMP instruction's address operand
    } GotoPatch;

HashEntry

Simple key/value bucket used by the project's HashMap.

    typedef struct HashEntry {
        char *key;
        int keylen;
        void *val;
    } HashEntry;

HashMap

Lightweight open-addressing hashmap used for symbol tables, macros, and other small maps.

Hideset

Represents a set of macro names that have been hidden to prevent recursive macro expansion.

JCC_OP

VM instruction opcodes for the JCC bytecode.

    typedef enum {
    #define X(NAME) NAME,
        OPS_X
    #undef X
    } JCC_OP;

Discussion
The VM is stack-based with an accumulator `ax`. These opcodes are emitted by the code generator and interpreted by the VM executor.

JCCAsmCallback

Callback invoked when an `asm("...")` statement is encountered during code generation.

    typedef void (*JCCAsmCallback)(JCC *vm, const char *asm_str, void *user_data);

Discussion
The callback may emit custom bytecode into the VM's text segment, perform logging, or otherwise handle the asm string.

JCCFlags

Bitwise flags for JCC runtime features and safety checks.
    typedef enum {
        // Memory safety flags (bits 0-19)
        JCC_BOUNDS_CHECKS      = (1 << 0),   // 0x00000001 - Array bounds checking
        JCC_UAF_DETECTION      = (1 << 1),   // 0x00000002 - Use-after-free detection
        JCC_TYPE_CHECKS        = (1 << 2),   // 0x00000004 - Runtime type checking
        JCC_UNINIT_DETECTION   = (1 << 3),   // 0x00000008 - Uninitialized variable detection
        JCC_OVERFLOW_CHECKS    = (1 << 4),   // 0x00000010 - Integer overflow detection
        JCC_STACK_CANARIES     = (1 << 5),   // 0x00000020 - Stack canary protection
        JCC_HEAP_CANARIES      = (1 << 6),   // 0x00000040 - Heap canary protection
        JCC_MEMORY_LEAK_DETECT = (1 << 7),   // 0x00000080 - Memory leak detection
        JCC_STACK_INSTR        = (1 << 8),   // 0x00000100 - Stack variable instrumentation
        JCC_DANGLING_DETECT    = (1 << 9),   // 0x00000200 - Dangling pointer detection
        JCC_ALIGNMENT_CHECKS   = (1 << 10),  // 0x00000400 - Pointer alignment checking
        JCC_PROVENANCE_TRACK   = (1 << 11),  // 0x00000800 - Pointer provenance tracking
        JCC_INVALID_ARITH      = (1 << 12),  // 0x00001000 - Invalid pointer arithmetic detection
        JCC_FORMAT_STR_CHECKS  = (1 << 13),  // 0x00002000 - Format string validation
        JCC_RANDOM_CANARIES    = (1 << 14),  // 0x00004000 - Random canary values
        JCC_MEMORY_POISONING   = (1 << 15),  // 0x00008000 - Poison allocated/freed memory
        JCC_MEMORY_TAGGING     = (1 << 16),  // 0x00010000 - Temporal memory tagging
        JCC_VM_HEAP            = (1 << 17),  // 0x00020000 - Force VM-managed heap
        JCC_CFI                = (1 << 18),  // 0x00040000 - Control flow integrity
        JCC_STACK_INSTR_ERRORS = (1 << 19),  // 0x00080000 - Stack instrumentation errors
        JCC_ENABLE_DEBUGGER    = (1 << 20),  // 0x00100000 - Interactive debugger

        // Convenience flag combinations
        JCC_POINTER_SANITIZER = (JCC_BOUNDS_CHECKS | JCC_UAF_DETECTION | JCC_TYPE_CHECKS),
        JCC_ALL_SAFETY = 0x000FFFFF,  // All safety features (bits 0-19)

        // VM heap is auto-enabled when any of these flags are set
        JCC_VM_HEAP_TRIGGERS = (JCC_VM_HEAP | JCC_HEAP_CANARIES | JCC_MEMORY_LEAK_DETECT |
                                JCC_UAF_DETECTION | JCC_POINTER_SANITIZER | JCC_BOUNDS_CHECKS |
                                JCC_MEMORY_TAGGING),

        // Pointer validity checks
        JCC_POINTER_CHECKS = (JCC_UAF_DETECTION | JCC_BOUNDS_CHECKS | JCC_DANGLING_DETECT |
                              JCC_MEMORY_TAGGING),
    } JCCFlags;

Discussion
These flags control memory safety features, debugging, and runtime behavior. Flags can be combined with bitwise OR. Some flags are convenience constants that represent multiple underlying flags.

LabelEntry

Tracks labels used for goto and labeled statements and their defined addresses in the generated text segment.

    typedef struct LabelEntry {
        char *name;          // Label name (for named labels)
        char *unique_label;  // Unique label identifier (for break/continue)
        long long *address;  // Address in text segment where label is defined
    } LabelEntry;

Member

Member (field) descriptor for struct and union types.

    typedef struct Member {
        struct Member *next;
        Type *ty;
        Token *tok;  // for error message
        Token *name;
        int idx;
        int align;
        int offset;
        // Bitfield
        bool is_bitfield;
        int bit_offset;
        int bit_width;
    } Member;

NodeKind

Kinds of AST nodes produced by the parser.

    typedef enum {
        ND_NULL_EXPR = 0,   // Do nothing
        ND_ADD = 1,         // +
        ND_SUB = 2,         // -
        ND_MUL = 3,         // *
        ND_DIV = 4,         // /
        ND_NEG = 5,         // unary -
        ND_MOD = 6,         // %
        ND_BITAND = 7,      // &
        ND_BITOR = 8,       // |
        ND_BITXOR = 9,      // ^
        ND_SHL = 10,        // <<
        ND_SHR = 11,        // >>
        ND_EQ = 12,         // ==
        ND_NE = 13,         // !=
        ND_LT = 14,         // <
        ND_LE = 15,         // <=
        ND_ASSIGN = 16,     // =
        ND_COND = 17,       // ?:
        ND_COMMA = 18,      // ,
        ND_MEMBER = 19,     // . (struct member access)
        ND_ADDR = 20,       // unary &
        ND_DEREF = 21,      // unary *
        ND_NOT = 22,        // !
        ND_BITNOT = 23,     // ~
        ND_LOGAND = 24,     // &&
        ND_LOGOR = 25,      // ||
        ND_RETURN = 26,     // "return"
        ND_IF = 27,         // "if"
        ND_FOR = 28,        // "for" or "while"
        ND_DO = 29,         // "do"
        ND_SWITCH = 30,     // "switch"
        ND_CASE = 31,       // "case"
        ND_BLOCK = 32,      // { ... }
        ND_GOTO = 33,       // "goto"
        ND_GOTO_EXPR = 34,  // "goto" labels-as-values
        ND_LABEL = 35,      // Labeled statement
        ND_LABEL_VAL = 36,  // [GNU] Labels-as-values
        ND_FUNCALL = 37,    // Function call
        ND_EXPR_STMT = 38,  // Expression statement
        ND_STMT_EXPR = 39,  // Statement expression
        ND_VAR = 40,        // Variable
        ND_VLA_PTR = 41,    // VLA designator
        ND_NUM = 42,        // Integer
        ND_CAST = 43,       // Type cast
        ND_MEMZERO = 44,    // Zero-clear a stack variable
        ND_ASM = 45,        // "asm"
        ND_CAS = 46,        // Atomic compare-and-swap
        ND_EXCH = 47,       // Atomic exchange
    } NodeKind;

PragmaMacro

Represents a pragma macro function.

    typedef struct PragmaMacro {
        char *name;                // Function name
        Token *body_tokens;        // Original token stream for function body
        void *compiled_fn;         // Compiled function pointer (returns Node*)
        JCC *macro_vm;             // VM instance for this macro
        struct PragmaMacro *next;  // Next macro in list
    } PragmaMacro;

ProvenanceInfo

Tracks pointer provenance (origin) for validation.

    typedef struct ProvenanceInfo {
        int origin_type;  // 0=HEAP, 1=STACK, 2=GLOBAL
        long long base;
        size_t size;
    } ProvenanceInfo;

Relocation

A relocation record for a global variable initializer that references another global symbol.

    typedef struct Relocation {
        struct Relocation *next;
        int offset;
        char **label;
        long addend;
    } Relocation;

Discussion
Each relocation describes a pointer-sized slot within a global's initializer data that must be patched with the address of another global (or label) when codegen finalizes the data segment.

Scope

Represents a parser block scope. Two kinds of block scopes are used: one for variables/typedefs and another for tags.

    typedef struct Scope {
        struct Scope *next;
        // C has two block scopes; one is for variables/typedefs and
        // the other is for struct/union/enum tags.
        HashMap vars;
        HashMap tags;
    } Scope;

SourceMap

Maps bytecode offsets to source file locations for debugger support.
    typedef struct SourceMap {
        long long pc_offset;  // Offset in text segment
        File *file;           // Source file
        int line_no;          // Line number in source
    } SourceMap;

StackPtrInfo

Tracks stack pointer information for dangling pointer detection.

    typedef struct StackPtrInfo {
        long long bp;
        long long offset;
        size_t size;
        int scope_id;
    } StackPtrInfo;

StackVarMeta

Unified metadata for stack variable instrumentation.

    typedef struct StackVarMeta {
        char *name;
        long long bp;
        long long offset;
        Type *ty;
        int scope_id;
        int is_alive;
        int initialized;
        long long read_count;
        long long write_count;
    } StackVarMeta;
StringArray

Dynamic array of strings used for include paths and similar small lists.

    typedef struct StringArray {
        char **data;
        int capacity;
        int len;
    } StringArray;

Token

Token produced by the lexer or by macro expansion.

    typedef struct Token {
        TokenKind kind;        // Token kind
        struct Token *next;    // Next token
        int64_t val;           // If kind is TK_NUM, its value
        long double fval;      // If kind is TK_NUM, its value
        char *loc;             // Token location
        int len;               // Token length
        Type *ty;              // Used if TK_NUM or TK_STR
        char *str;             // String literal contents including terminating '\0'
        File *file;            // Source location
        char *filename;        // Filename
        int line_no;           // Line number
        int line_delta;        // Line number
        bool at_bol;           // True if this token is at beginning of line
        bool has_space;        // True if this token follows a space character
        Hideset *hideset;      // For macro expansion
        struct Token *origin;  // If this is expanded from a macro, the original token
    } Token;

TokenKind

Kinds of lexical tokens produced by the tokenizer and used by the preprocessor and parser.

    typedef enum {
        TK_IDENT,    // Identifiers
        TK_PUNCT,    // Punctuators
        TK_KEYWORD,  // Keywords
        TK_STR,      // String literals
        TK_NUM,      // Numeric literals
        TK_PP_NUM,   // Preprocessing numbers
        TK_EOF,      // End-of-file markers
    } TokenKind;

TypeKind

Kind tag for the `Type` structure describing C types.

    typedef enum {
        TY_VOID = 0,
        TY_BOOL = 1,
        TY_CHAR = 2,
        TY_SHORT = 3,
        TY_INT = 4,
        TY_LONG = 5,
        TY_FLOAT = 6,
        TY_DOUBLE = 7,
        TY_LDOUBLE = 8,
        TY_ENUM = 9,
        TY_PTR = 10,
        TY_FUNC = 11,
        TY_ARRAY = 12,
        TY_VLA = 13,  // variable-length array
        TY_STRUCT = 14,
        TY_UNION = 15,
    } TypeKind;

Structs and Unions
JCC

Encapsulates all state for the JCC compiler and virtual machine. Instances are independent and support embedding.

    struct JCC {
        // VM Registers
        long long ax;   // Accumulator register (integer)
        double fax;     // Floating-point accumulator register
        long long *pc;  // Program counter
        long long *bp;  // Base pointer (frame pointer)
        long long *sp;  // Stack pointer
        long long cycle;  // Instruction cycle counter

        // Exit detection (for returning from main)
        long long *initial_sp;  // Initial stack pointer (for exit detection)
        long long *initial_bp;  // Initial base pointer (for exit detection)

        // Memory Segments
        long long *text_seg;      // Text segment (bytecode instructions)
        long long *text_ptr;      // Current write position (for code generation)
        long long *stack_seg;     // Stack segment
        long long *old_text_seg;  // Backup of original text segment pointer
        char *data_seg;           // Data segment (global variables/constants)
        char *data_ptr;           // Current write position in data segment
        char *heap_seg;           // Heap segment (for VM malloc/free)
        char *heap_ptr;           // Current allocation pointer (bump allocator)
        char *heap_end;           // End of heap segment
        FreeBlock *free_list;     // Head of free blocks list (for memory reuse)

        // Segregated free lists for optimized allocation
        // Size classes: 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, LARGE (>8192)
        FreeBlock *size_class_lists[12];  // One free list per size class (NUM_SIZE_CLASSES)
        FreeBlock *large_list;            // For allocations > MAX_SMALL_ALLOC (8192)

        // Memory safety tracking
        AllocRecord *alloc_list;  // List of active allocations (for leak detection)
        HashMap init_state;       // Track initialization state of stack variables (for uninitialized detection)
        HashMap stack_ptrs;       // Track stack pointers for dangling detection (ptr -> {bp, offset, size})
        HashMap provenance;       // Track pointer provenance (ptr -> {origin_type, base, size})
        HashMap stack_var_meta;   // Unified stack variable metadata (bp+offset -> StackVarMeta)
        HashMap alloc_map;        // Maps base addresses to AllocHeaders (for fast pointer validation)
        HashMap ptr_tags;         // Maps pointers to their creation generation tags (for temporal safety)

        // Sorted allocation array for O(log n) range queries (CHKP/CHKT performance)
        struct {
            void **addresses;       // Sorted array of base addresses
            AllocHeader **headers;  // Parallel array of headers
            int count;              // Number of active allocations
            int capacity;           // Allocated array capacity
        } sorted_allocs;

        // Configuration
        int poolsize;  // Size of memory segments (bytes)
        int debug_vm;  // Enable debug output during execution

        // Runtime flags (bitwise combination of JCCFlags)
        uint32_t flags;          // JCCFlags bitfield for all safety and runtime features
        long long stack_canary;  // Stack canary value (random if JCC_RANDOM_CANARIES set, else fixed)
        int in_vm_alloc;         // Reentrancy guard: prevents HashMap from triggering VM heap recursion

        // Control Flow Integrity (shadow stack)
        long long *shadow_stack;  // Shadow stack for return addresses (CFI)
        long long *shadow_sp;     // Shadow stack pointer

        // Stack instrumentation state
        int current_scope_id;           // Incremented for each scope entry
        int current_function_scope_id;  // Scope ID of current function being generated
        long long stack_high_water;     // Maximum stack usage tracking

        // Debugger state (enable via JCC_ENABLE_DEBUGGER flag)
        Breakpoint breakpoints[MAX_BREAKPOINTS];  // Breakpoint table
        int num_breakpoints;               // Number of active breakpoints
        int single_step;                   // Single-step mode (stop after each instruction)
        int step_over;                     // Step over mode (skip function calls)
        int step_out;                      // Step out mode (run until function returns)
        long long *step_over_return_addr;  // Return address for step over
        long long *step_out_bp;            // Base pointer for step out
        int debugger_attached;             // Debugger REPL is active

        // Source mapping for debugger (bytecode ↔ source lines)
        SourceMap *source_map;    // Array of PC to source location mappings
        int source_map_count;     // Number of source map entries
        int source_map_capacity;  // Allocated capacity
        File *last_debug_file;    // Last file during debug info emission
        int last_debug_line;      // Last line number during debug info emission

        // Debug symbols for expression evaluation
    #ifndef MAX_DEBUG_SYMBOLS
    #define MAX_DEBUG_SYMBOLS 4096
    #endif
        DebugSymbol debug_symbols[MAX_DEBUG_SYMBOLS];  // Symbol table for debugger
        int num_debug_symbols;                         // Number of symbols

        // Watchpoints (data breakpoints)
        Watchpoint watchpoints[MAX_WATCHPOINTS];  // Watchpoint table
        int num_watchpoints;                      // Number of active watchpoints

        // Preprocessor state
        bool skip_preprocess;  // Skip preprocessing step
        HashMap macros;
        CondIncl *cond_incl;
        HashMap pragma_once;
        int include_next_idx;

        // Pragma macro system
        struct PragmaMacro *pragma_macros;  // Linked list of compile-time macros
        bool compiling_pragma_macro;        // True when compiling a pragma macro (skip main() requirement)

        // Tokenization state
        File *current_file;  // Input file
        File **input_files;  // A list of all input files.
        bool at_bol;         // True if the current position is at the beginning of a line
        bool has_space;      // True if the current position follows a space character

        // Parser state
        // All local variable instances created during parsing are
        // accumulated to this list.
        Obj *locals;
        // Likewise, global variables are accumulated to this list.
        Obj *globals;
        Scope *scope;
        // Track variable being initialized (for const initialization)
        Obj *initializing_var;
        // Points to the function object the parser is currently parsing.
        Obj *current_fn;
        // Lists of all goto statements and labels in the current function.
        Node *gotos;
        Node *labels;
        // Current "break" and "continue" jump targets.
        char *brk_label;
        char *cont_label;
        // Points to a node representing a switch if we are parsing
        // a switch statement. Otherwise, NULL.
        Node *current_switch;
        Obj *builtin_alloca;
        Obj *builtin_setjmp;
        Obj *builtin_longjmp;
        StringArray include_paths;
        StringArray system_include_paths;  // System header search paths for <...>

        // Code generation state
        int label_counter;  // For generating unique labels
        int local_offset;   // Current local variable offset
    #ifndef MAX_CALLS
    #define MAX_CALLS 1024
    #endif
        struct {
            long long *location;  // Location in text segment to patch
            Obj *function;        // Function to call
        } call_patches[MAX_CALLS];
        int num_call_patches;

        // Function address patches for function pointers
        struct {
            long long *location;  // Location of IMM operand to patch with function address
            Obj *function;        // Function whose address to use
        } func_addr_patches[MAX_CALLS];
        int num_func_addr_patches;
    #ifndef MAX_LABELS
    #define MAX_LABELS 256
    #endif
        LabelEntry label_table[MAX_LABELS];
        int num_labels;
        GotoPatch goto_patches[MAX_LABELS];
        int num_goto_patches;

        // Inline assembly callback
        JCCAsmCallback asm_callback;  // User-provided callback for asm statements
        void *asm_user_data;          // User-provided context for callback

        // Foreign Function Interface (FFI)
        ForeignFunc *ffi_table;  // Registry of foreign C functions
        int ffi_count;           // Number of registered functions
        int ffi_capacity;        // Capacity of ffi_table array

        // Current function being compiled (for VLA cleanup)
        Obj *current_codegen_fn;

        // Struct/union return buffer (copy-before-return approach)
        char *return_buffer;     // Buffer for struct/union returns
        int return_buffer_size;  // Size of return buffer

        // Linked programs for extern offset propagation
        Obj **link_progs;     // Array of original program lists
        int link_prog_count;  // Number of programs

        // Error handling (setjmp/longjmp for exception-like behavior)
        jmp_buf *error_jmp_buf;  // Jump buffer for error handling (NULL = use exit())
        char *error_message;     // Last error message (when using longjmp)
    };

Discussion
The structure contains registers, memory segments, frontend state (preprocessor, tokenizer, parser) and codegen/VM
bookkeeping. Most public API functions accept a `JCC *` as the first parameter; the exceptions listed in this reference are cc_output_json and cc_print_tokens.

Node

Represents a node in the parser's abstract syntax tree.

    struct Node {
        NodeKind kind;      // Node kind
        struct Node *next;  // Next node
        Type *ty;           // Type, e.g. int or pointer to int
        Token *tok;         // Representative token
        struct Node *lhs;   // Left-hand side
        struct Node *rhs;   // Right-hand side

        // "if" or "for" statement
        struct Node *cond;
        struct Node *then;
        struct Node *els;
        struct Node *init;
        struct Node *inc;

        // "break" and "continue" labels
        char *brk_label;
        char *cont_label;

        // Block or statement expression
        struct Node *body;

        // Struct member access
        Member *member;

        // Function call
        Type *func_ty;
        struct Node *args;
        bool pass_by_stack;
        Obj *ret_buffer;

        // Goto or labeled statement, or labels-as-values
        char *label;
        char *unique_label;
        struct Node *goto_next;

        // Switch
        struct Node *case_next;
        struct Node *default_case;

        // Case
        long begin;
        long end;

        // "asm" string literal
        char *asm_str;

        // Atomic compare-and-swap
        struct Node *cas_addr;
        struct Node *cas_old;
        struct Node *cas_new;

        // Atomic op= operators
        Obj *atomic_addr;
        struct Node *atomic_expr;

        // Variable
        Obj *var;

        // Numeric literal
        int64_t val;
        long double fval;
    };

Obj

Represents a C object: either a variable (global/local) or a function. The parser and code generator use Obj for symbol and storage tracking.
    struct Obj {
        struct Obj *next;
        char *name;     // Variable name
        Type *ty;       // Type
        Token *tok;     // representative token
        bool is_local;  // local or global/function
        int align;      // alignment

        // Local variable
        int offset;

        // Global variable or function
        bool is_function;
        bool is_definition;
        bool is_static;
        bool is_constexpr;

        // Global variable
        bool is_tentative;
        bool is_tls;
        char *init_data;
        Relocation *rel;
        Node *init_expr;  // For constexpr: AST of initializer expression

        // Function
        bool is_inline;
        Obj *params;
        Node *body;
        Obj *locals;
        Obj *va_area;
        Obj *alloca_bottom;
        int stack_size;

        // Static inline function
        bool is_live;
        bool is_root;
        StringArray refs;

        // Code generation (for VM)
        long long code_addr;  // Address in text segment where function code starts
    };
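The `next` and `is_definition` fields are what a linker such as cc_link_progs needs in order to prefer definitions over declarations when merging Obj lists. The following is a minimal sketch of that idea under assumptions: DemoObj is a hypothetical three-field stand-in for Obj, and demo_merge is not the library's algorithm, only one straightforward way to get the documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for Obj with only the fields used here. */
typedef struct DemoObj {
    struct DemoObj *next;
    const char *name;
    bool is_definition;
} DemoObj;

/* Merge the list `src` into `merged` and return the new head.
 * New names are appended. For a name that already exists, a
 * definition replaces an earlier declaration; otherwise the
 * existing entry is kept. */
DemoObj *demo_merge(DemoObj *merged, DemoObj *src) {
    while (src) {
        assert(src->name != NULL);
        DemoObj *next = src->next;

        /* Find the slot holding an entry with the same name, or the
         * terminating NULL slot if there is none. */
        DemoObj **slot = &merged;
        while (*slot && strcmp((*slot)->name, src->name) != 0)
            slot = &(*slot)->next;

        if (!*slot) {
            src->next = NULL;           /* new name: append */
            *slot = src;
        } else if (src->is_definition && !(*slot)->is_definition) {
            src->next = (*slot)->next;  /* definition wins */
            *slot = src;
        }
        src = next;
    }
    return merged;
}
```

The linear name search keeps the sketch short; a real implementation handling many symbols would more plausibly use a hash map keyed by name.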
Type

Central representation of a C type in the compiler.

    struct Type {
        TypeKind kind;
        int size;             // sizeof() value
        int align;            // alignment
        bool is_unsigned;     // unsigned or signed
        bool is_atomic;       // true if _Atomic
        bool is_const;        // true if const-qualified
        struct Type *origin;  // for type compatibility check

        // Pointer-to or array-of type. We intentionally use the same member
        // to represent pointer/array duality in C.
        //
        // In many contexts in which a pointer is expected, we examine this
        // member instead of "kind" member to determine whether a type is a
        // pointer or not. That means in many contexts "array of T" is
        // naturally handled as if it were "pointer to T", as required by
        // the C spec.
        struct Type *base;

        // Declaration
        Token *name;
        Token *name_pos;

        // Array
        int array_len;

        // Variable-length array
        Node *vla_len;  // # of elements
        Obj *vla_size;  // sizeof() value

        // Struct
        struct Member *members;
        bool is_flexible;
        bool is_packed;

        // Enum (tracks enum constants for code generation)
        EnumConstant *enum_constants;

        // Function type
        struct Type *return_ty;
        struct Type *params;
        bool is_variadic;
        struct Type *next;
    };

Macro Definitions
Breakpoint, MAX_BREAKPOINTS

Represents a debugger breakpoint at a specific program counter location.

    #define MAX_BREAKPOINTS 256

Watchpoint, MAX_WATCHPOINTS

Represents a data breakpoint that triggers on memory access.

    #define MAX_WATCHPOINTS 64
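The fixed MAX_BREAKPOINTS limit explains why cc_add_breakpoint can return -1. A fixed-capacity table with that behavior might look like the sketch below. The layout of the real Breakpoint struct is not shown in this reference, so DemoBreakpoint, DemoBreakpointTable, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_BREAKPOINTS 256  /* same limit as MAX_BREAKPOINTS */

/* Hypothetical breakpoint record. */
typedef struct {
    long long *pc;
    int enabled;
} DemoBreakpoint;

typedef struct {
    DemoBreakpoint items[DEMO_MAX_BREAKPOINTS];
    int count;
} DemoBreakpointTable;

/* Add a breakpoint. Returns its index, or -1 if the table is full. */
int demo_add_breakpoint(DemoBreakpointTable *t, long long *pc) {
    assert(t != NULL);
    if (t->count >= DEMO_MAX_BREAKPOINTS)
        return -1;
    t->items[t->count].pc = pc;
    t->items[t->count].enabled = 1;
    return t->count++;
}

/* Return the index of an enabled breakpoint at pc, or -1 if none. */
int demo_find_breakpoint(const DemoBreakpointTable *t, const long long *pc) {
    for (int i = 0; i < t->count; i++)
        if (t->items[i].enabled && t->items[i].pc == pc)
            return i;
    return -1;
}
```

A VM loop could call a lookup like demo_find_breakpoint with the current program counter before executing each instruction and enter the debugger when the result is not -1.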