In recent years, in-network support for machine learning (ML) has improved significantly through dedicated line-rate inference architectures (e.g., Taurus). However, programming ML models for these platforms still demands deep expertise in ML, in low-level hardware languages (e.g., the Spatial HDL), and in navigating strict performance and hardware constraints. We introduce Homunculus, a supervised-learning framework and domain-specific language (DSL) that automates this process: given a training dataset together with performance constraints (e.g., latency, throughput) and hardware constraints (e.g., compute, memory), it explores the design space automatically to generate a compliant, high-performance ML model.
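To make the workflow concrete, the sketch below illustrates constraint-driven design-space exploration in Python. It is not the actual Homunculus DSL: all names (`Constraints`, `Candidate`, the cost and accuracy estimators, and their parameters) are hypothetical, and the exhaustive sweep with analytical cost models stands in for whatever search strategy and training loop the real framework uses.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Constraints:
    """User-supplied budgets (hypothetical schema, not the Homunculus DSL)."""
    max_latency_ns: float       # per-packet inference latency budget
    min_throughput_gbps: float  # line-rate requirement
    max_alus: int               # compute units available in the pipeline
    max_sram_kb: float          # on-chip memory budget


@dataclass
class Candidate:
    """One point in the model design space: an MLP of given depth/width."""
    layers: int
    width: int


def estimate_cost(c: Candidate):
    """Crude analytical cost model; coefficients are illustrative only."""
    alus = c.layers * c.width
    sram_kb = c.layers * c.width * 0.5
    latency_ns = 20.0 * c.layers          # deeper pipeline -> more latency
    throughput_gbps = 200.0 / c.layers    # deeper pipeline -> lower rate
    return alus, sram_kb, latency_ns, throughput_gbps


def estimate_accuracy(c: Candidate) -> float:
    """Placeholder for training/validating the candidate on the dataset."""
    return 1.0 - 1.0 / (1.0 + 0.1 * c.layers * c.width)


def explore(budget: Constraints):
    """Sweep the design space; keep the most accurate compliant model."""
    best, best_acc = None, -1.0
    for layers, width in product(range(1, 5), (4, 8, 16, 32)):
        cand = Candidate(layers, width)
        alus, sram_kb, latency_ns, tput = estimate_cost(cand)
        if alus > budget.max_alus or sram_kb > budget.max_sram_kb:
            continue  # violates hardware budget
        if latency_ns > budget.max_latency_ns or tput < budget.min_throughput_gbps:
            continue  # violates performance budget
        acc = estimate_accuracy(cand)
        if acc > best_acc:
            best, best_acc = cand, acc
    return best, best_acc


if __name__ == "__main__":
    budget = Constraints(max_latency_ns=100, min_throughput_gbps=100,
                         max_alus=64, max_sram_kb=40)
    model, acc = explore(budget)
    print(f"selected {model} with estimated accuracy {acc:.3f}")
```

A production framework would replace the exhaustive sweep and analytical cost models with a smarter search (e.g., Bayesian optimization) and real training runs per candidate; the sketch only shows the shape of the loop: enumerate candidates, prune those that violate the performance or hardware budgets, and keep the most accurate survivor.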