Compositing 101: How to Pick the Right Composite Length

Composite length is one of the most consequential and least-discussed decisions in resource estimation. Here's the field-tested framework for picking the right one.

By Ghozian Karami · 9 min read
Compositing is the step everyone glosses over. The lecture notes say “composite to a uniform length before kriging” and the analyst picks 2m because that’s what was on the last project. Job done.

Except composite length is one of the most consequential decisions in the whole estimation pipeline. It controls the support of your input data, the variance you’re trying to model with the variogram, and ultimately the smoothness of your block estimates. Get it wrong and the rest of the work is fighting against a poorly defined input.

This post is the practical framework. Not exhaustive geostatistics, just what to think about and what to avoid.

Why we composite at all

Raw drillhole assays are sampled at irregular intervals. A typical project has 0.5m intervals through ore zones, 1m or 2m intervals through hangingwall and footwall, and the occasional 3m interval where logging called the rock waste. Mixing those supports in a single estimation is a problem because:

  • Variance changes with support. Short samples are more variable than long ones (the support effect). A variogram fit to mixed-support data is meaningless.
  • Kriging weights assume equal supports. Kriging assigns each sample a weight from its position, not its length. If your samples are 0.5m and 3m, the 3m sample represents six times as much rock as the 0.5m sample, and nothing in its kriging weight reflects that. The estimate becomes biased.
  • Domain control needs consistent intervals. Composite generation lets you snap intervals to lithological or weathering domain boundaries cleanly.

The fix is compositing: regenerate samples on a uniform length, weighted by interval length within each composite.
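To make the length weighting concrete, here is a minimal Python sketch of one composite built from two raw intervals. The function name and the grades are illustrative assumptions, not values from a real project.

```python
def length_weighted_grade(intervals):
    """Length-weighted mean grade of (length_m, grade) pairs inside one composite."""
    total_length = sum(length for length, _ in intervals)
    if total_length <= 0:
        raise ValueError("composite has no assayed length")
    return sum(length * grade for length, grade in intervals) / total_length


# Hypothetical 2m composite: 0.5m at 2.0% Cu plus 1.5m at 0.4% Cu
pieces = [(0.5, 2.0), (1.5, 0.4)]
weighted = length_weighted_grade(pieces)
unweighted = sum(grade for _, grade in pieces) / len(pieces)
print(f"length-weighted: {weighted:.2f}%  unweighted: {unweighted:.2f}%")
```

The short high-grade interval pulls the unweighted mean up to 1.20%, while the length-weighted grade is 0.80%. That gap is the bias you carry into estimation when supports are mixed.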

The three lengths in tension

When you’re picking a composite length, three constraints fight each other:

  1. Sample length: the average raw assay interval. Composites shorter than this are inventing data. Composites longer than about 4 to 5x average sample length over-smooth.
  2. Selective Mining Unit (SMU): the smallest practical block of ore mining can selectively extract. Composites should align with the support that mining actually delivers.
  3. Bench height: the vertical interval mining will work in. Composites should be a divisor of bench height so block models snap cleanly.

| Constraint | Typical range | Influence on composite |
|---|---|---|
| Sample length | 0.5m to 3m | Lower bound on composite |
| SMU | 2.5m to 10m | Upper bound on composite for selectivity |
| Bench height | 5m to 15m | Composite should divide evenly into bench |
| Block size (kriging) | 30 to 50% of drillhole spacing | Composite typically 25 to 50% of block height |

The honest answer is the composite length lives in the overlap zone of these three constraints. There isn’t a single right number, but there’s usually a defensible range, and within that range there’s a sensible default.

A practical decision tree

Use this when you’re staring at a fresh database and need to pick a composite length:

Q1: What's the average raw sample length?
    └─ Call this L_sample

Q2: What's the planned bench height?
    └─ Call this H_bench

Q3: What's the planned SMU height (often = bench, sometimes half-bench)?
    └─ Call this H_smu

Composite length L_comp should satisfy:

   L_sample <= L_comp <= H_smu / 2

   AND  L_comp divides evenly into H_bench (so block models snap)

   AND  L_comp >= P75 of raw sample lengths (avoid creating composites
                                              shorter than most samples)

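If multiple values satisfy: the one closest to L_sample keeps the most
                     detail, but if it equals L_sample compositing
                     changes nothing, so step up to a larger candidate.
If nothing satisfies: revisit either the bench height assumption
                     or the sample protocol on this project.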

Worked example: raw assay average 1.0m, P75 raw sample length 1.0m, planned bench height 5m, planned SMU 5m.

  • Lower bound: 1.0m
  • Upper bound: 5/2 = 2.5m
  • Bench divisors: 1.0m, 1.25m, 2.5m, 5.0m
  • Filter to range: 1.0m, 1.25m, 2.5m
  • Closest to L_sample: 1.0m, but that introduces no compositing benefit. Pick 2.5m as the largest defensible composite that still respects SMU/2.
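The filtering steps of the decision tree can be sketched in a few lines of Python. This is a rough illustration, not a definitive implementation: the function name `candidate_lengths` is hypothetical, and restricting candidates to a 0.25m grid is my assumption (it keeps the list to lengths people actually use). The final pick is left to the analyst.

```python
def candidate_lengths(l_sample, p75_sample, h_bench, h_smu, step=0.25):
    """Composite lengths (m) on a `step` grid that pass the decision-tree filters.

    Filters: L_comp >= L_sample, L_comp >= P75 of raw lengths,
    L_comp <= H_smu / 2, and L_comp divides evenly into H_bench.
    """
    # Work in whole centimetres so the divisibility test is exact
    lower_cm = round(max(l_sample, p75_sample) * 100)
    upper_cm = round(h_smu / 2 * 100)
    bench_cm = round(h_bench * 100)
    step_cm = round(step * 100)

    candidates = []
    length_cm = step_cm
    while length_cm <= upper_cm:
        if length_cm >= lower_cm and bench_cm % length_cm == 0:
            candidates.append(length_cm / 100)
        length_cm += step_cm
    return candidates


# Worked example above: 1.0m samples, 5m bench, 5m SMU
print(candidate_lengths(1.0, 1.0, 5.0, 5.0))
```

For the worked example this returns 1.0, 1.25, and 2.5, matching the filtered list. Off-grid divisors such as 5/3 m are excluded by the grid assumption, so loosen `step` if your project uses them.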

A reasonable default for many porphyry and disseminated gold projects ends up at 2m or 2.5m. Bedded deposits with 5m benches often go to 2.5m. Underground vein-style projects sometimes go shorter (1.0m to 1.5m) to preserve high-grade variability.

Fixed-length versus honoring geology

Two main compositing schemes exist:

Fixed-length compositing

The drillhole is divided into uniform intervals (say 2m), starting from the collar (or the top of the first sample). Every composite gets the same length, regardless of what lithology it falls into. Some composites at domain boundaries will be mixed-lithology.

Pros: simple, consistent, easy to validate.

Cons: composites that straddle domain boundaries dilute the high-grade domain into the low-grade domain. If 30% of your composites are mixed-domain, your domain statistics will be polluted.
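A minimal sketch of fixed-length compositing for a single hole, assuming samples arrive as depth-sorted, non-overlapping `(from_m, to_m, grade)` tuples. The function name and the hole data are hypothetical. It starts from the top of the first sample and ignores domains entirely, which is exactly why boundary composites end up mixed.

```python
def fixed_length_composites(samples, length):
    """Cut one hole into fixed-length composites.

    samples: depth-sorted, non-overlapping (from_m, to_m, grade) tuples.
    Returns (from_m, to_m, grade, assayed_m) tuples.
    """
    top = samples[0][0]   # start at the first sampled depth
    end = samples[-1][1]
    composites = []
    while top < end - 1e-9:
        bottom = min(top + length, end)
        metal = assayed = 0.0
        for s_from, s_to, grade in samples:
            # Length of this sample that falls inside the composite
            overlap = min(bottom, s_to) - max(top, s_from)
            if overlap > 0:
                metal += overlap * grade
                assayed += overlap
        if assayed > 0:
            composites.append((top, bottom, metal / assayed, assayed))
        top = bottom
    return composites


# Hypothetical hole: short samples near the top, longer ones below
hole = [(0.0, 0.5, 2.0), (0.5, 1.0, 1.0), (1.0, 3.0, 0.5), (3.0, 4.0, 0.1)]
for comp in fixed_length_composites(hole, 2.0):
    print(comp)
```

The grade is averaged over the assayed length only, so an unsampled gap inside a composite shrinks `assayed_m` rather than diluting the grade with zeros.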

Honored-geology compositing (residual length)

The drillhole is divided into 2m intervals within each domain separately. At domain boundaries, the leftover residual interval is either kept short, dropped, or merged with the adjacent same-domain composite if available.

Pros: every composite belongs cleanly to one domain. Domain statistics are clean.

Cons: composite lengths aren’t perfectly uniform, so you have to decide what to do with the residuals. Composites of 0.4m at the bottom of an ore zone are often dropped (they have insufficient support and add noise).

Default: composite within domain. Mixed-domain composites are a silent data-quality hit that doesn’t show up until you fit the variogram and wonder why the sill is so high.

If you’re working from a flat-file CSV without domains coded, do the domain coding first, then composite. Don’t composite raw and try to assign domains to composites afterward. The boundary errors compound.
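One way to sketch composite-within-domain in Python, assuming each sample already carries a domain code and the samples are depth-sorted `(from_m, to_m, grade, domain)` tuples. The function name and data are hypothetical. Residuals are simply kept short here; dropping or merging them is a separate decision.

```python
from itertools import groupby


def composite_within_domains(samples, length):
    """Composite each contiguous same-domain run separately.

    samples: depth-sorted (from_m, to_m, grade, domain) tuples for one hole.
    Returns (from_m, to_m, grade, domain) tuples; none straddles a boundary.
    """
    composites = []
    # groupby yields one group per contiguous run of the same domain code
    for domain, run in groupby(samples, key=lambda s: s[3]):
        run = list(run)
        top, end = run[0][0], run[-1][1]
        while top < end - 1e-9:
            bottom = min(top + length, end)   # the residual is kept short
            metal = assayed = 0.0
            for s_from, s_to, grade, _ in run:
                overlap = min(bottom, s_to) - max(top, s_from)
                if overlap > 0:
                    metal += overlap * grade
                    assayed += overlap
            if assayed > 0:
                composites.append((top, bottom, metal / assayed, domain))
            top = bottom
    return composites


# Hypothetical hole: 3m of oxide over 2m of sulfide, 1m samples
hole = [
    (0.0, 1.0, 0.2, "oxide"), (1.0, 2.0, 0.4, "oxide"), (2.0, 3.0, 0.6, "oxide"),
    (3.0, 4.0, 1.0, "sulfide"), (4.0, 5.0, 2.0, "sulfide"),
]
for comp in composite_within_domains(hole, 2.0):
    print(comp)
```

With a 2m target this gives a 2m oxide composite, a 1m oxide residual, and a 2m sulfide composite. A fixed-length pass from the collar would instead blend oxide and sulfide across the 2m to 4m interval.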

Handling missing intervals

Real drill data has gaps:

  • Lost core (“no recovery”)
  • Skipped samples (assayer ran out of bag)
  • Below detection limit values
  • Pre-collar overburden not sampled

For each, the compositing rule should be explicit:

  • Lost core / missing interval: the composite that contains the gap should be flagged and either rejected if the gap is more than about 25 to 30% of the composite length, or kept with a length-weighted average over only the assayed portion. Document either way.
  • BDL values: convert to a numeric placeholder before compositing (typically half the detection limit, sometimes detection limit, depending on convention). Compositing BDL as zero is a common error that biases low.
  • Pre-collar overburden: usually start the compositing from the top of bedrock, not the collar.
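  • End-of-hole residuals: the last few meters often produce a short composite. Keep if it’s at least half the target length. Drop or merge with the previous composite if shorter.

# Rule for handling residuals (returns the action to take)
def handle_residual(residual_length, target_length, previous_composite_same_domain):
    if residual_length >= target_length * 0.5:
        return "keep"        # long enough to stand as a short composite
    if residual_length > 0:
        if previous_composite_same_domain:
            return "merge"   # fold into the previous composite
        return "drop"        # write a log entry for the dropped interval
    return "none"            # nothing left over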
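The BDL and missing-interval rules can be sketched the same way. This assumes BDL results are stored as text flags like `<0.02` and lost core as an empty value, which varies by lab and database. The function names, the 0.5 substitution factor, and the 30% gap limit are illustrative defaults drawn from the ranges above.

```python
def clean_grade(raw, bdl_factor=0.5):
    """Convert one assay field to a float; None means no assay (gap).

    Assumes '<0.02'-style flags for below-detection-limit results.
    """
    if raw is None or str(raw).strip() == "":
        return None
    text = str(raw).strip()
    if text.startswith("<"):
        return bdl_factor * float(text[1:])   # e.g. half the detection limit
    return float(text)


def composite_grade_with_gaps(intervals, target_length, max_gap_fraction=0.3):
    """Length-weighted grade over the assayed portion of one composite.

    intervals: (length_m, raw_assay) pairs. Returns None when the
    unassayed length exceeds max_gap_fraction of the target length.
    """
    cleaned = [(length, clean_grade(raw)) for length, raw in intervals]
    gap = sum(length for length, grade in cleaned if grade is None)
    assayed = sum(length for length, grade in cleaned if grade is not None)
    if assayed == 0 or gap / target_length > max_gap_fraction:
        return None   # reject; the caller should log the reason
    metal = sum(length * grade for length, grade in cleaned if grade is not None)
    return metal / assayed


# Hypothetical 2m composite with 0.4m of lost core and one BDL result
print(composite_grade_with_gaps([(1.0, "0.8"), (0.4, None), (0.6, "<0.02")], 2.0))
# Half the composite is missing, so this one is rejected
print(composite_grade_with_gaps([(1.0, "0.8"), (1.0, None)], 2.0))
```

Substituting before compositing matters: if the BDL interval were treated as zero or skipped, the composite grade would shift and the mean-preservation check later would be harder to interpret.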

Top-of-hole and bottom-of-hole edge cases

Two specific edge cases trip people up:

Top-of-hole: the first composite often starts in overburden or pre-collar. If the first sample doesn’t start at depth zero, the first composite should start at the first sampled depth, not depth zero. Otherwise the first composite gets padded with implicit zeros that don’t exist in the data.

Bottom-of-hole: holes ended in mineralization (not in fresh waste) often have a final partial composite. This is informative data, especially if the final composite has a high grade, because it suggests the hole stopped before the deposit ended. Keep these and flag them. They’re useful for extension drilling planning even if they’re conservative for the resource estimate.
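The top-of-hole problem is easy to see with numbers. The sketch below uses a hypothetical hole whose sampling starts at 3.0m and compares a composite cut on a grid from the collar (unsampled length counted as zero grade) with one that starts at the first sampled depth. The function name and values are illustrative.

```python
def composite_grade(samples, top, bottom, pad_unsampled_with_zero):
    """Grade of samples (from_m, to_m, grade) clipped to [top, bottom].

    With padding, the divisor is the full composite length, so any
    unsampled length behaves like zero-grade rock. Without padding,
    the divisor is the assayed length only.
    """
    metal = assayed = 0.0
    for s_from, s_to, grade in samples:
        overlap = min(bottom, s_to) - max(top, s_from)
        if overlap > 0:
            metal += overlap * grade
            assayed += overlap
    divisor = (bottom - top) if pad_unsampled_with_zero else assayed
    return metal / divisor if divisor > 0 else None


# Hypothetical hole: overburden to 3.0m, then two 1m samples
samples = [(3.0, 4.0, 0.8), (4.0, 5.0, 0.6)]

padded = composite_grade(samples, 2.0, 4.0, pad_unsampled_with_zero=True)
from_first_sample = composite_grade(samples, 3.0, 5.0, pad_unsampled_with_zero=False)
print(f"padded 2-4m: {padded:.2f}  from first sample 3-5m: {from_first_sample:.2f}")
```

The padded composite reports 0.40 for rock that assayed 0.80, purely because a metre of unsampled overburden was counted as zero. Starting at 3.0m gives 0.70 from real data only.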

When to NOT composite

Compositing isn’t always appropriate. Three cases where you should leave the raw data alone:

  1. Single-sample-per-hole campaigns: very short holes, scout drilling, or auger sampling where you have one sample per hole. With nothing to combine, compositing doesn’t help.
  2. Non-grade variables that aren’t length-meaningful: pH, hardness, density (sometimes), and any categorical variable. Composite grades; carry domain codes through unchanged.
  3. Geometallurgical sampling with deliberately variable support: if your sampling protocol intentionally took samples at different supports for a metallurgical reason, don’t homogenize them with arithmetic compositing. Use the original samples for met work, and composite separately for resource estimation.

A worked example: 1m raw assays to 2m composites for OK estimation

Setup: copper porphyry, 60 holes, 1m raw assays, P75 sample length 1.0m, planned bench 4m, SMU 4m. From the decision tree:

  • Lower bound: 1.0m
  • Upper bound: SMU/2 = 2.0m
  • Bench divisors: 1.0m, 2.0m, 4.0m
  • Filter to range: 1.0m, 2.0m

Pick 2.0m. Domain-honored compositing inside three coded lithologies (oxide, transition, sulfide).

# Pseudo-workflow
composites = composite_drillholes(
    assays=raw_assays,
    domains=lithology,
    target_length=2.0,
    method="honor_domain",
    residual_threshold=0.5,   # keep residuals >= 1.0m
    weighting="length_weighted",
    bdl_value=0.5 * detection_limit,
)

# Result: 1521 raw samples become 768 composites
# - 720 full-length composites (2.0m exactly)
# - 35 partial composites at domain boundaries (1.0m to 1.9m, kept)
# - 13 short residuals dropped (logged)
# - 4 mixed-domain composites flagged (logged for review)

Validation after compositing:

| Check | Raw assays | 2m composites | Note |
|---|---|---|---|
| Mean Cu (%) | 0.42 | 0.42 | Length-weighted, should match |
| Variance | 0.31 | 0.21 | Lower variance, expected (support effect) |
| Sample count | 1521 | 768 | Roughly halved, as expected |
| Mixed-domain count | n/a | 4 | Acceptable, documented |
| Domain integrity | 100% | 99.5% | 4 mixed-domain composites flagged |

The mean grade matched (length-weighted compositing preserves global mean). The variance dropped (the whole point of compositing). The domain integrity is high enough to proceed with variogram analysis on a per-domain basis.
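A toy illustration of the same two effects, using made-up 1m grades (not the project data above) paired into 2m composites. Because every raw sample has the same length, a plain average of each pair equals the length-weighted average.

```python
from statistics import mean, pvariance

# Hypothetical 1m Cu grades (%) down one hole
raw = [0.10, 0.90, 0.30, 0.50, 0.20, 0.60, 0.70, 0.06]

# Pair consecutive 1m samples into 2m composites
composites = [(a + b) / 2 for a, b in zip(raw[0::2], raw[1::2])]

print(f"raw:        n={len(raw)}  mean={mean(raw):.3f}  var={pvariance(raw):.4f}")
print(f"composites: n={len(composites)}  mean={mean(composites):.3f}  "
      f"var={pvariance(composites):.4f}")
```

The mean is unchanged and the variance falls. How far it falls on real data depends on how strongly adjacent samples are correlated, so treat the size of the drop here as illustrative only.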

Validation after compositing

Always run these checks before moving to variography:

  • Mean grade preservation: composite mean should match raw mean within rounding. If it doesn’t, your length-weighting is broken or your BDL handling shifted.
  • Length distribution: composite lengths should be tightly clustered at the target length. A bimodal distribution means residual handling is inconsistent.
  • Domain purity: count of mixed-domain composites should be small (less than 5% of total).
  • Per-domain statistics: mean, variance, and count per domain should make geological sense. A vein domain with the same mean grade as the host rock means your domain definition is wrong, not your compositing.

If your composites and your raw data disagree on the global mean by more than 1 to 2%, find the bug before fitting a variogram. Don’t proceed.
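These checks are simple enough to script. The sketch below is one possible shape, not a standard routine: the function name, the tuple layout, and the returned keys are assumptions, and the 2% and 5% thresholds are taken from the guidance above.

```python
def validate_composites(raw, composites, target_length,
                        mean_tol=0.02, max_mixed_fraction=0.05):
    """Post-compositing checks.

    raw:        (length_m, grade) tuples
    composites: (length_m, grade, is_mixed_domain) tuples
    """
    def length_weighted_mean(rows):
        return sum(r[0] * r[1] for r in rows) / sum(r[0] for r in rows)

    raw_mean = length_weighted_mean(raw)
    comp_mean = length_weighted_mean(composites)
    rel_diff = abs(comp_mean - raw_mean) / raw_mean

    n = len(composites)
    full = sum(1 for c in composites if abs(c[0] - target_length) < 1e-6)
    mixed = sum(1 for c in composites if c[2])

    return {
        "mean_rel_diff": rel_diff,
        "mean_ok": rel_diff <= mean_tol,        # global mean preserved
        "full_length_fraction": full / n,       # should be close to 1
        "mixed_fraction": mixed / n,
        "purity_ok": mixed / n < max_mixed_fraction,
    }


# Hypothetical data: four 1m samples composited to two clean 2m composites
raw = [(1.0, 0.2), (1.0, 0.4), (1.0, 0.6), (1.0, 0.8)]
composites = [(2.0, 0.3, False), (2.0, 0.7, False)]
print(validate_composites(raw, composites, target_length=2.0))
```

Run it once per domain as well as globally; a clean global mean can hide offsetting errors between domains.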

Where Orebit Geotools fits in

Phase 02 (Drilling EDA) handles compositing as a standard step in the workflow:

  • Auto-suggests composite length based on raw sample distribution and inferred bench size
  • Defaults to honor-domain compositing when domain codes are present
  • Flags mixed-domain composites for review
  • Logs every residual decision (kept, dropped, merged) in an exportable CSV
  • Outputs validation table showing mean preservation, variance reduction, and domain purity per composite run

You can override every default. The auto-suggestions are a starting point, not a constraint.

For the upstream and downstream steps, see drillhole validation for data prep and variography for what comes next.


Bottom line

Composite length isn’t a default. It’s a decision. Get the sample length, SMU, and bench height in front of you. Pick a composite length that respects all three. Honor domains. Document residuals. Validate the global mean is preserved.

Then move on to the variogram knowing your input is clean. Most of the variography arguments I’ve watched senior geologists have are downstream symptoms of compositing decisions made carelessly. Get this step right and the rest of the workflow gets easier.


Working through a tricky compositing decision and want a second opinion? Email hello@orebit.id with the raw sample histogram and your bench height.