
Building Open-Labor-PH: Engineering the Knowledge Layer for Philippine AI

December 2024

I am currently building HipStaff, an automated HR platform for the Philippines. The business case is simple: Philippine HR is complex, paper-heavy, and difficult to navigate.

But the engineering reality of automating it with AI turned out to be much harder than I expected.

When I started integrating Large Language Models (LLMs) to answer questions about labor regulations, I hit a wall. It wasn't a problem with my code; it was a problem with the "brain" I was using.

I realized that if I wanted an AI that could actually be trusted with Philippine compliance, I couldn't just "prompt" my way out of it. I had to build a new layer of digital infrastructure.

So, I am building Open-Labor-PH, and I am open-sourcing the core data engine.

Here is the engineering problem I am solving, and why I'm doing it in public.

The Problem: AI "Drift" and Legal Hallucinations

As I began testing models like GPT-4 and Claude 3.5 on Philippine Labor Law, two critical failure modes emerged that made them dangerous for production use.

1. The "Cutoff" Gap

Government regulations are living documents. DOLE issues new Department Orders constantly.

  • The Reality: DOLE updated its rules on the employment of foreign nationals (DO-248) in early 2025.
  • The Failure: The models I tested have training cutoffs that predate this order, so they don't know it exists.
  • The Consequence: A standard AI chatbot will confidently give advice that is now legally obsolete.

2. The Jurisdiction Bleed

Most LLMs are trained on data that skews heavily American.

  • The Reality: The US follows "At-Will Employment" (fire for almost any reason). The Philippines follows "Security of Tenure" (strict Due Process required).
  • The Failure: When I asked the AI about terminating an underperforming employee, it frequently fell back on American concepts, suggesting immediate-termination strategies that would invite an Illegal Dismissal case in the Philippines.

The Solution: Data as Infrastructure

I realized that Fine-Tuning wasn't the immediate answer (it's too slow and expensive to retrain a model every time a law changes).

The answer is RAG (Retrieval-Augmented Generation), but RAG is only as good as the data you feed it. And right now, Philippine Labor Law data is trapped in scanned PDFs and unstructured HTML.

I am building the parser to fix this.
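For context, here is a minimal sketch of the retrieval step that parsed data would feed. It is not the Open-Labor-PH implementation: the chunk shape (`id`, `text`), the function names, and the word-overlap scoring are all assumptions made for illustration, and a real system would more likely use embeddings or BM25.

```python
import re

def tokens(text):
    """Lowercase word set; a toy stand-in for embeddings or BM25."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, k=2):
    """Rank chunks by word overlap with the question and return the top k."""
    q = tokens(question)
    scored = [(len(q & tokens(c["text"])), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] > 0]  # drop non-matches
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]

def build_prompt(question, retrieved):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in retrieved)
    return (
        "Answer using only the passages below. "
        "If they do not cover the question, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# Placeholder passages, not real statutory text (hypothetical ids).
chunks = [
    {"id": "doc-a#1", "text": "Placeholder rule about employment permits for foreign nationals."},
    {"id": "doc-b#1", "text": "Placeholder rule about holiday pay computation."},
]

question = "What permits do foreign nationals need?"
prompt = build_prompt(question, retrieve(question, chunks, k=1))
print(prompt)
```

The model only ever sees what `retrieve` hands it, which is why the quality of the underlying chunks matters more than the prompt wording.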

The Open-Labor-PH repository is an initiative to treat government statutes like software code. I am writing the ingestion engine to:

  • Atomize the Law: Break down massive PDFs into semantic chunks (JSON).
  • Tag for Validity: Add metadata for date_promulgated and status, so the AI can tell whether a provision is still in force or has been repealed.
  • Benchmark Reality: Create a "Golden Evaluation Suite": hundreds of QA pairs with ground-truth answers, so a model's accuracy on Philippine labor law can be measured instead of assumed.
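The first two steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual schema: only `date_promulgated` and `status` come from the description above, while the other field names, the status values, and the "Section N." heading pattern are hypothetical. It also assumes the text has already been extracted from the PDF.

```python
import json
import re
from dataclasses import dataclass, asdict
from datetime import date

# Hypothetical schema: only `date_promulgated` and `status` are named in the
# post; every other field and the status values are illustrative assumptions.
@dataclass
class LawChunk:
    chunk_id: str
    source: str            # the issuance the chunk came from
    heading: str           # e.g. "Section 2"
    text: str
    date_promulgated: str  # ISO 8601 date string
    status: str            # assumed values: "in_force" or "repealed"

# Assumes provisions start with "Section N." or "Article N." on a new line.
HEADING = re.compile(r"^(Section|Article)\s+(\d+)\.", re.MULTILINE)

def atomize(source, raw_text, date_promulgated, status="in_force"):
    """Split one issuance into one chunk per Section/Article heading."""
    matches = list(HEADING.finditer(raw_text))
    chunks = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw_text)
        heading = f"{m.group(1)} {m.group(2)}"
        chunks.append(LawChunk(
            chunk_id=f"{source}#{heading.replace(' ', '-').lower()}",
            source=source,
            heading=heading,
            text=raw_text[m.end():end].strip(),
            date_promulgated=date_promulgated,
            status=status,
        ))
    return chunks

def in_force(chunks, as_of):
    """Keep chunks that are not repealed and were promulgated on or before `as_of`."""
    return [
        c for c in chunks
        if c.status == "in_force"
        and date.fromisoformat(c.date_promulgated) <= as_of
    ]

# Placeholder text, not real statutory language.
sample = "Section 1. Placeholder coverage text.\nSection 2. Placeholder permit text.\n"
chunks = atomize("EXAMPLE-ORDER", sample, "2025-01-15")
chunks[0].status = "repealed"  # pretend a later issuance repealed Section 1

current = in_force(chunks, as_of=date(2025, 6, 1))
print(json.dumps([asdict(c) for c in current], indent=2))
```

Filtering on `status` and `date_promulgated` before retrieval is what keeps a repealed provision from being quoted as current law.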

Why Open Source?

I am keeping the commercial SaaS logic (HipStaff) private, but I believe the Legal Knowledge Graph should be public infrastructure.

We cannot build the next generation of Philippine tech tools if every startup has to scrape the same 500 PDFs from the DOLE website and clean them manually.

By open-sourcing the dataset and the evaluation tools, I am hoping to create a standard that other engineers, students, and researchers can use.

The Roadmap

I have just pushed the initial repo with the first set of JSON schemas and the ingestion script.

  • Phase 1: Ingestion Engine (turning PDFs into JSON), in progress
  • Phase 2: The Golden Eval Suite (benchmark tests)
  • Phase 3: A public "Leaderboard" of which models understand PH law best
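To give a feel for Phase 2, here is a small sketch of how a QA pair might be scored. The case format and function names are hypothetical, and keyword matching is a deliberately crude stand-in for real grading (it would wrongly fail an answer that says "the Philippines is not at-will", for example). The single test case reuses the termination scenario described earlier.

```python
from dataclasses import dataclass, field

# Hypothetical case format: the post only says the suite holds QA pairs with
# ground-truth answers. Keyword checks are a simplification for illustration.
@dataclass
class EvalCase:
    question: str
    must_mention: list = field(default_factory=list)
    must_not_mention: list = field(default_factory=list)

def passes(case, answer):
    """True if the answer has every required phrase and no forbidden one."""
    text = answer.lower()
    has_required = all(p.lower() in text for p in case.must_mention)
    has_forbidden = any(p.lower() in text for p in case.must_not_mention)
    return has_required and not has_forbidden

def run_suite(cases, ask):
    """`ask` is any callable mapping a question string to an answer string."""
    results = [passes(c, ask(c.question)) for c in cases]
    return sum(results) / len(results) if results else 0.0

cases = [
    EvalCase(
        question="Can I fire an underperforming employee effective today?",
        must_mention=["due process"],
        must_not_mention=["at-will"],
    ),
]

# Stub models standing in for real LLM calls.
def us_biased_model(question):
    return "Yes. Employment is at-will, so you can terminate immediately."

def grounded_model(question):
    return "No. Security of tenure applies, so due process is required first."

print(run_suite(cases, us_biased_model))  # 0.0
print(run_suite(cases, grounded_model))   # 1.0
```

Because `run_suite` accepts any callable, the same cases could be pointed at different models, which is the basis a Phase 3 leaderboard would need.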

If you are a dev interested in Rust/Python, or a law student interested in AI, check out the repo at github.com/jcyrus/open-labor-ph

Cyrus Espiritu is the founder of HipStaff, building modern payroll and HR software for Philippine businesses. He is also working on Open-Labor-PH, an open-source project to create structured, machine-readable Philippine labor law data.
