Software switches are emerging as a vital measurement vantage point in many networked systems. Sketching algorithms or sketches, provide high-fidelity approximate measurements, and appear as a promising alternative to traditional approaches such as packet sampling. However, sketches incur significant computation overhead in software switches. Existing efforts in implementing sketches in virtual switches make sacrifices on one or more of the following dimensions: performance (handling 40 Gbps line-rate packet throughput with low CPU footprint), robustness (accuracy guarantees across diverse workloads), and generality (supporting various measurement tasks). In this work, we present the design and implementation of NitroSketch, a sketching framework that systematically addresses the performance bottlenecks of sketches without sacrificing robustness and generality. Our key contribution is the careful synthesis of rigorous, yet practical solutions to reduce the number of per-packet CPU and memory operations. We implement NitroSketch on three popular software platforms (Open vSwitch-DPDK, FD.io-VPP, and BESS) and evaluate the performance. We show that accuracy is comparable to unmodified sketches while attaining up to two orders of magnitude speedup, and up to 45% reduction in CPU usage.