"Evaluate the sub bids" sounds like one job, but a bridge package, a paving package, and a water-main package each go wrong in a different spot. The math you run is the same — normalize line items, find the scope gaps, flag the outliers — but where the money hides and which gap costs you shifts by project type. An estimator who levels bridges all day will scan a paving package and miss the thing that matters, because the failure mode lives somewhere else. The point of heavy-civil estimating software is to apply the same disciplined comparison across every package type while you keep your attention on the parts that are actually type-specific. This article walks the four big categories — structures, paving, water/utility, and transit/electrical — and names what to check on each.

Bridge and structures: the splits hide the money

Structures packages are the ones where two bids that look $200K apart are actually identical, and two that look identical are $300K apart. The reason is that the big dollars sit in items that subs split differently. Concrete, rebar, and formwork are the obvious ones, but the gap usually opens in three places.

First, falsework and temporary works. Some subs carry falsework, shoring, and forming as a discrete line; others bury it inside the concrete unit price; others assume the GC supplies it. Three bids, three conventions, and the "cheap" one is cheap because it left out the temporary structure that holds the deck up while it cures. That is not a saving; it is a scope gap that becomes a change order the week before the pour.

Second, the rebar split. Reinforcing steel can be priced as furnish-and-install, furnish-only with the GC placing, or by the pound versus by the assembly. If one sub quotes "rebar, 184 tons, installed" and another quotes "rebar furnish, 184 tons" with placement nowhere on the page, the second total is lower for a reason that will cost you later. Map both onto the same furnish-and-install footing before you compare them, or the column lies.
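
A minimal sketch of putting two rebar quotes on the same furnish-and-install footing. The dollar amounts, the basis labels, and the per-ton placement plug are all hypothetical illustrations, not market rates; in practice the plug would come from your own placing cost or a placer's quote.

```python
# Hypothetical rebar quotes. The basis labels and the plug value are assumptions.
PLACEMENT_PLUG_PER_TON = 650.0  # illustrative estimator's plug, not a market rate

quotes = [
    {"sub": "Sub A", "tons": 184, "total": 331_200.0, "basis": "furnish_install"},
    {"sub": "Sub B", "tons": 184, "total": 220_800.0, "basis": "furnish_only"},
]

def to_furnish_install(quote, plug_per_ton=PLACEMENT_PLUG_PER_TON):
    """Restate a quote total on a furnish-and-install basis."""
    if quote["basis"] == "furnish_install":
        return quote["total"]
    if quote["basis"] == "furnish_only":
        # Add the missing placement scope so the totals are comparable.
        return quote["total"] + plug_per_ton * quote["tons"]
    raise ValueError(f"unknown basis: {quote['basis']}")

leveled = {q["sub"]: to_furnish_install(q) for q in quotes}
for sub, total in leveled.items():
    print(f"{sub}: ${total:,.0f} furnish-and-install")
```

With these made-up numbers, Sub B reads $110,400 lower as quoted and $9,200 higher once placement is added back, which is the kind of reversal the leveling step exists to expose.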

Third, bearings, joints, and embeds: small line counts, real dollars, and the items most often quietly excluded. What to check on a structures package: that falsework is carried by someone, that every rebar line is on the same furnish/install basis, that bearings, joints, and embeds are priced rather than excluded, and that the per-CY structural-concrete prices cluster. A unit-price analysis across the field catches the concrete number that is half the peer median because the sub left the forming out of it.

Highway and paving: tonnage assumptions and MOT

Paving packages fail on two things that are barely visible on the cover sheet: the tonnage math and the traffic-control line.

On tonnage, asphalt and base are priced per ton, and the ton count comes from an assumed thickness and an assumed compacted density. Two subs working off the same plan can land 10% apart on total HMA tonnage purely because one assumed 165 lb/cf and the other 150, or because one carried the full overlay thickness and the other the plan minimum. The per-ton unit prices can be nearly identical while the totals diverge, which means the spread you see is a quantity assumption, not a price difference. Check that the implied tonnage is consistent across bidders before you trust the totals. A total-bid outlier more than 20% off the field on a paving package is almost always a tonnage assumption, not a pricing edge.
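
The tonnage arithmetic is simple enough to check by hand: area times thickness times compacted density, converted from pounds to short tons. The sketch below uses a hypothetical 120,000 sf overlay at 2 inches; only the two densities come from the text, and the function name is illustrative.

```python
def hma_tons(area_sf, thickness_in, density_pcf):
    """Implied HMA tonnage: volume in cubic feet x density (lb/cf) / 2000 lb per ton."""
    volume_cf = area_sf * thickness_in / 12
    return volume_cf * density_pcf / 2000

# Hypothetical overlay: same plan area and thickness, two density assumptions.
low = hma_tons(120_000, 2, 150)
high = hma_tons(120_000, 2, 165)
spread = (high - low) / low

print(f"{low:,.0f} tons vs {high:,.0f} tons, spread {spread:.0%}")
```

Dividing each bidder's HMA total by its unit price gives the implied tonnage, which can then be compared against a figure like this before the totals are trusted.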

Maintenance of traffic (MOT) is the paving line most often underbid or excluded. Phasing, flaggers, temporary striping, signage, and barrier carry real cost and stretch across the whole job duration, so a sub who treats MOT as a small lump while everyone else itemizes it by phase is either taking a risk or planning to claim it back. Check that MOT scales with the schedule, not just the paving quantity. The other paving trap is mobilization: paving crews move equipment in waves, so a front-loaded mobilization above 10% of total deserves a question, even though some early-equipment-heavy jobs justify it.
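
One way to test whether MOT scales with the schedule is to divide each bidder's MOT dollars by that bidder's own proposed duration and compare the monthly rates. This is a rough sketch with hypothetical bids; the 0.5x cutoff is borrowed here as an assumption, not a rule specific to MOT.

```python
from statistics import median

# Hypothetical bids: MOT dollars and each sub's proposed duration in months.
bids = {
    "Sub A": {"mot": 180_000, "months": 6},
    "Sub B": {"mot": 210_000, "months": 7},
    "Sub C": {"mot": 45_000, "months": 6},
}

def mot_per_month(bids):
    """MOT cost normalized by schedule duration, per bidder."""
    return {name: b["mot"] / b["months"] for name, b in bids.items()}

rates = mot_per_month(bids)
peer = median(rates.values())

# Assumed cutoff: a monthly MOT rate under half the peer median deserves a question.
thin = [name for name, rate in rates.items() if rate < 0.5 * peer]
print(thin)
```

A bidder flagged here has not necessarily made an error, but a lump-sum MOT that does not track duration is worth a clarification call before award.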

Water and utility: testing, connections, and the unseen lines

Utility packages — water main, sanitary, storm — are deceptively linear. The pipe runs by the linear foot, the prices cluster, and it looks easy to level. The cost lives in the items that are not the pipe.

Testing and commissioning is the first. Pressure testing, disinfection, bacteriological clearance, and CCTV inspection of gravity lines are required, take time, and are routinely either lump-summed at wildly different numbers or left off entirely. A bid with no testing line is not cheaper; it is incomplete. Connections and tie-ins are the second: wet taps, connections to a live main, and valve and fitting assemblies. These are low in count and high in dollars, and they are where one sub carries a $40K wet-tap allowance and another carries nothing. When the work is genuinely uncertain, an allowance is the honest way to price it; the danger is comparing a bidder who carried the allowance against one who simply omitted the scope.

Third, dewatering, sheeting, and restoration — the trench-support and surface-restoration items that depend on depth and groundwater the plans only hint at. Check a utility package for: a real testing line on every bid, tie-ins and connections priced rather than assumed, and dewatering carried where the trench depth calls for it. The pipe footage is the easy part; the field separates on everything around it.
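
The utility checklist above amounts to a scope matrix: for each bid, which required items are priced, which are carried as allowances, and which are simply missing. A minimal sketch follows, with hypothetical item names and amounts (only the $40K wet-tap allowance echoes the text).

```python
# Hypothetical required scope items for a water-main package.
REQUIRED = ["pressure_test", "disinfection", "wet_tap", "dewatering", "restoration"]

# Each bid maps scope item -> (amount, kind), where kind is "priced" or "allowance".
bids = {
    "Sub A": {
        "pressure_test": (12_000, "priced"),
        "disinfection": (8_000, "priced"),
        "wet_tap": (40_000, "allowance"),
        "dewatering": (55_000, "priced"),
        "restoration": (30_000, "priced"),
    },
    "Sub B": {
        "pressure_test": (10_000, "priced"),
        "dewatering": (48_000, "priced"),
        "restoration": (27_000, "priced"),
    },
}

def scope_gaps(bid, required=REQUIRED):
    """Required items that do not appear on the bid at all."""
    return [item for item in required if item not in bid]

def allowances(bid):
    """Items the bidder carried as an allowance rather than a firm price."""
    return sorted(item for item, (_, kind) in bid.items() if kind == "allowance")

gaps = {sub: scope_gaps(bid) for sub, bid in bids.items()}
carried = {sub: allowances(bid) for sub, bid in bids.items()}
print(gaps)
print(carried)
```

Keeping gaps and allowances in separate columns preserves the distinction the comparison depends on: Sub A priced the uncertainty, while Sub B left the scope out.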


Transit and electrical: long-lead items and allowances

Transit, signal, and electrical packages introduce two pressures the dirt trades rarely face: procurement risk and design-incomplete pricing.

Long-lead items drive these bids. Switchgear, signal cabinets, special trackwork, transformers, and controls can carry lead times measured in many months, and a sub's price is only as good as the quote behind it and the date that quote expires. Two electrical bids can be far apart entirely because one locked a current supplier quote and the other is carrying a number from last year that will not hold. Check that long-lead pricing is tied to a current, dated quote and that the sub's schedule reflects the procurement window, not just the install duration. A long-lead item with a stale price is a claim waiting to happen.
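
The two procurement checks (is the quote still valid when the order can be placed, and does the lead time fit before install) reduce to date arithmetic. The sketch below is illustrative only; the function name, the flag wording, and all dates are hypothetical.

```python
from datetime import date, timedelta

def procurement_flags(quote_expires, order_date, lead_weeks, install_start):
    """Return a list of schedule and quote problems for one long-lead item."""
    flags = []
    if quote_expires < order_date:
        flags.append("quote expires before the order can be placed")
    if order_date + timedelta(weeks=lead_weeks) > install_start:
        flags.append("lead time overruns the install start")
    return flags

# Hypothetical switchgear line: stale quote and a 40-week lead time.
risky = procurement_flags(
    quote_expires=date(2025, 3, 31),
    order_date=date(2025, 5, 1),
    lead_weeks=40,
    install_start=date(2025, 12, 1),
)

# Hypothetical signal cabinet: current quote and a 20-week lead time.
clean = procurement_flags(
    quote_expires=date(2025, 6, 30),
    order_date=date(2025, 5, 1),
    lead_weeks=20,
    install_start=date(2025, 12, 1),
)
print(risky)
print(clean)
```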

Allowances show up far more on these packages because the design is often less complete at bid time: testing and commissioning of systems, integration with existing infrastructure, and owner-furnished equipment coordination all get carried as allowances. That is legitimate, but it makes apples-to-apples comparison harder: a bid that is "low" because it carried thin allowances is not low. Check that allowance amounts are comparable across bidders and that the same items are covered. The standard risk flags still apply (a penny-priced unit item or a front-loaded mobilization means the same thing here as anywhere), but on transit and electrical the schedule and procurement story carries unusual weight.

What stays the same across every type

The type-specific traps above ride on top of a comparison process that does not change. Whatever the package, you still normalize every sub's line items onto your scope of work, build a clean tab, run a peer median per item so one wild number does not move the benchmark, and apply the four deterministic risk rules — unbalanced unit price at or below $1.00, a peer outlier above 2x or below 0.5x the item median, a total more than 20% off the field, and front-loaded mobilization above 10% of total. The six scoring dimensions — price, scope, schedule, compliance, performance, risk — are the same on a culvert as on a viaduct. What shifts by type is which dimension does the deciding: scope on structures, schedule on transit, and the cost of a missed line on everything.
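
The four rules can be sketched as plain threshold checks. This is an illustration under stated assumptions, not Bid Reasoner's actual implementation: the data layout is hypothetical, "the field" is taken to mean the median of bid totals, and the peer median includes the bidder being tested.

```python
from statistics import median

def risk_flags(bids):
    """bids: {sub: {"items": {item: unit_price}, "total": float, "mobilization": float}}"""
    flags = []
    field_total = median(b["total"] for b in bids.values())  # assumed meaning of "the field"
    all_items = {item for b in bids.values() for item in b["items"]}
    item_median = {
        item: median(b["items"][item] for b in bids.values() if item in b["items"])
        for item in all_items
    }
    for sub, b in bids.items():
        for item, price in sorted(b["items"].items()):
            if price <= 1.00:
                flags.append((sub, item, "unbalanced unit price"))
            peer = item_median[item]
            if peer > 0 and (price / peer > 2.0 or price / peer < 0.5):
                flags.append((sub, item, "peer outlier"))
        if abs(b["total"] - field_total) / field_total > 0.20:
            flags.append((sub, "total", "total off field"))
        if b["mobilization"] / b["total"] > 0.10:
            flags.append((sub, "mobilization", "front-loaded mobilization"))
    return flags

# Hypothetical three-bid paving package.
bids = {
    "Sub A": {"items": {"HMA ton": 95.0, "striping LF": 0.85},
              "total": 1_000_000.0, "mobilization": 60_000.0},
    "Sub B": {"items": {"HMA ton": 98.0, "striping LF": 2.10},
              "total": 1_050_000.0, "mobilization": 80_000.0},
    "Sub C": {"items": {"HMA ton": 102.0, "striping LF": 2.00},
              "total": 1_400_000.0, "mobilization": 180_000.0},
}
flags = risk_flags(bids)
for flag in flags:
    print(flag)
```

Using the median rather than the mean for both benchmarks is what keeps one wild number from moving the reference point, as described above.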

Bid Reasoner runs that comparison on the sub bids you receive and works in any US state through peer-median normalization, so no government data is required to level a package. Where a state-DOT catalog helps, built-in state-DOT pay-item baselines give select states a head start — currently NY and NJ — but they are never a requirement, and the same process levels a package in any state without them. You pick a decision mode, you get a ranked recommendation with page-cited evidence behind every flag, and any override you make is logged with your reason.

The takeaway: evaluating sub bids is not one skill; it is the same skill aimed at a different failure mode each time. Know where each package type hides the money (falsework and rebar splits on structures, tonnage and MOT on paving, testing and tie-ins on utility, long-lead and allowances on transit) and the comparison gets honest. Heavy-civil estimating software exists to keep the process consistent across all of them so your judgment goes to the parts that are genuinely type-specific.