Scanned P&IDs vs CAD Exports: What to Expect
How drawing source affects instrument extraction quality. Scanned brownfield P&IDs vs CAD-exported drawings. Resolution, readability, and getting clean results.
Most P&ID extraction happens on brownfield projects. The drawings already exist. The question is what condition they're in.
A clean CAD export and a 20-year-old scan of a hand-drafted drawing are both "PDFs," but they behave completely differently when you're trying to pull instrument data out of them. Here's what actually matters.
Vector PDFs vs raster scans
The single biggest factor in extraction quality is whether your PDF contains vector graphics or raster images.
| | Vector (CAD export) | Raster (scanned) |
|---|---|---|
| Source | Exported directly from AutoCAD, MicroStation, SmartPlant | Flatbed scanner, photo, or print-to-PDF of a photocopy |
| Text | Selectable, searchable | Embedded in image pixels |
| Zoom behavior | Stays crisp at any zoom | Gets blurry when zoomed |
| Tag readability | Characters are exact | Subject to scan artifacts, smudging, fading |
| Typical file size | 200 KB to 2 MB per page | 5 MB to 50 MB per page |
| Extraction quality | Strong, text is unambiguous | Varies, depends on scan condition |
Quick test: open your PDF and try to select text with your cursor. If you can highlight individual tag numbers, it's vector. If you can only select the whole page as an image, it's a scan.
There is also a third category that causes its own problems: the vector-converted-to-raster PDF. This happens when a drafter exports a CAD drawing to PDF and someone then prints and re-scans that PDF, or when a document management system rasterizes on ingest. The result looks like a scan even though it originated as clean vector. You lose all the text-selection benefits and gain nothing in exchange.
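The quick test can be approximated in code once you have two measurements per page: how many characters the text layer holds and how much of the page is covered by an embedded image. The sketch below assumes you already have those numbers from whatever PDF library you use; `PageStats`, `classify_page`, and both thresholds are illustrative names and values, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Per-page measurements (hypothetical; obtain them with your PDF library)."""
    text_chars: int        # characters found in the page's text layer
    image_coverage: float  # fraction of page area covered by images, 0.0 to 1.0


def classify_page(stats: PageStats, min_chars: int = 20, full_page: float = 0.9) -> str:
    """Return 'raster', 'vector', or 'unknown'. Thresholds are assumptions."""
    # A page that is essentially one big image is treated as a scan,
    # including CAD exports that were rasterized somewhere downstream.
    if stats.image_coverage >= full_page:
        return "raster"
    # Real text with no full-page image points to a direct CAD export.
    if stats.text_chars >= min_chars:
        return "vector"
    return "unknown"


print(classify_page(PageStats(text_chars=1500, image_coverage=0.0)))  # vector
print(classify_page(PageStats(text_chars=0, image_coverage=1.0)))     # raster
```

Checking image coverage first means a rasterized CAD export lands in the raster bucket, which is where it belongs for review planning.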
What degrades scanned drawings
Not all scans are equal. These are the specific issues that make instrument tags harder to read.
Resolution. Anything below 200 DPI starts losing fine detail in tag text. ISA bubbles are small. At 150 DPI, the difference between "FIT" and "FLT" or between "101" and "1O1" gets ambiguous. 300 DPI is the practical standard for readable scans. At 600 DPI you gain almost nothing useful for text recognition, but the page holds four times as many pixels and file size grows accordingly.
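The resolution trade-off is simple arithmetic. The sketch below shows how many pixels tall a line of tag text ends up at each DPI, and how the raw pixel count scales between two resolutions. The 2.5 mm text height is an assumed, typical-looking value for illustration, not a figure from any standard.

```python
MM_PER_INCH = 25.4


def text_height_px(text_height_mm: float, dpi: int) -> float:
    """Height of a line of text in pixels when scanned at the given DPI."""
    return text_height_mm / MM_PER_INCH * dpi


def pixel_count_ratio(dpi_a: int, dpi_b: int) -> float:
    """How many times more pixels a page has at dpi_a than at dpi_b."""
    return (dpi_a / dpi_b) ** 2


# Assumed 2.5 mm tag text inside an instrument bubble.
for dpi in (150, 200, 300, 600):
    print(dpi, "DPI:", round(text_height_px(2.5, dpi), 1), "px tall")

print(pixel_count_ratio(600, 300))  # 4.0
```

At 150 DPI the assumed text is under 15 pixels tall, which leaves very few pixels to separate an "I" from an "L". Compressed file size does not track pixel count exactly, but it moves in the same direction.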
Contrast. Faded scanned drawings, yellowed paper, or low-contrast photocopies reduce the distinction between text and background. The classic failure mode is a light pencil annotation on a blue background, nearly invisible in a scan.
Skew and rotation. Drawings fed through a sheet scanner at a slight angle produce rotated text. A 2-3 degree skew is common and usually manageable. Beyond 5 degrees, tag text starts getting misread. Most sheet-fed scanners introduce some skew. Flatbed scanners are more accurate but require individual sheet placement.
Compression artifacts. Some document management systems re-compress PDFs for storage. Each compression cycle degrades image quality. If a drawing has been scanned, uploaded to a DMS, downloaded, emailed, and re-uploaded, it may have been compressed three or four times. Block artifacts around fine lines and text edges are the visible symptom.
Annotations and markups. Red-line markups, cloud revisions, and sticky notes layered on top of instrument bubbles can obscure tag numbers. Hand-written corrections next to printed tags create ambiguity about which value is current.
Microfilm and aperture card reproductions. Some drawing vaults hold originals only on microfilm. Microfilm scans add grain, reduced dynamic range, and sometimes halation around dense ink areas. These still produce results, but expect more tags to need manual verification on small text than a direct paper scan at equivalent DPI.
Side-by-side workflow comparison
| Step | Vector PDF | 300 DPI scanned PDF | Photo (camera image) |
|---|---|---|---|
| Pre-processing needed | None | Deskew if >3 degrees, split large files | Crop, straighten, increase contrast before upload |
| Tag extraction | Direct from text layer | From rendered page image | From re-encoded image |
| Review effort | Minimal. Spot-check output | Focus on degraded areas and small text | Expect 20-40% of tags to need manual correction |
| Recommended for | Any project where originals exist in CAD | Standard brownfield work | Last resort. Re-scan from originals if possible |
File preparation before upload
Getting the source files right before extraction saves more time than any post-processing step.
DPI thresholds. 300 DPI is the minimum for reliable extraction. Scan at 300 DPI grayscale or 300 DPI black-and-white for standard ink-on-paper drawings. Do not scan higher than 400 DPI unless you have a specific reason. You add file size without improving text readability.
Black-and-white vs grayscale. For drawings with black ink on white paper, true black-and-white, 1-bit, thresholded mode produces the sharpest text edges. Grayscale retains shading, which is useful for drawings with pencil annotations or faded ink that would be lost to thresholding. Color scanning adds file size with no benefit for instrument extraction.
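To see why thresholding can lose pencil marks, consider a single row of grayscale pixel values. This is a minimal sketch in plain Python; the sample values and the cutoff of 128 are assumptions chosen for illustration, and real scanner software usually picks its cutoff adaptively.

```python
def threshold(pixels, cutoff=128):
    """1-bit conversion: values below the cutoff become black (0), the rest white (255)."""
    return [0 if p < cutoff else 255 for p in pixels]


# paper, black ink, paper, faint pencil, paper
row = [250, 30, 245, 170, 248]
print(threshold(row))  # [255, 0, 255, 255, 255]
```

The ink pixel survives as pure black with a hard edge, but the pencil pixel (170) is pushed to white and disappears. Grayscale would have kept it.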
Deskew before upload. If your scanner software has auto-deskew, enable it. If not, a free tool like ScanTailor, or the deskew function in Adobe Acrobat, handles it. Drawings with more than 3 degrees of skew extract poorly. Drawings with more than 5 degrees should be deskewed manually before processing.
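The skew thresholds translate directly into a triage rule, and a little trigonometry shows why a few degrees matter on a large sheet. The function names below are hypothetical, and the 841 mm width (the long side of an A1 sheet) is just an example input.

```python
import math


def skew_action(skew_degrees: float) -> str:
    """Map a measured skew angle to the handling suggested in the text."""
    angle = abs(skew_degrees)
    if angle <= 3.0:
        return "upload as-is"
    if angle <= 5.0:
        return "auto-deskew"
    return "manual deskew"


def edge_drift_mm(sheet_width_mm: float, skew_degrees: float) -> float:
    """Vertical drift of a horizontal line across the sheet width at a given skew."""
    return sheet_width_mm * math.tan(math.radians(abs(skew_degrees)))


print(skew_action(2.0))                 # upload as-is
print(skew_action(4.0))                 # auto-deskew
print(skew_action(6.5))                 # manual deskew
print(round(edge_drift_mm(841, 3.0)))   # 44
```

A 3 degree skew moves a nominally horizontal line about 44 mm over the width of the example sheet, which is more than enough to tilt every tag on the page.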
Split large multi-page files. Large drawing sets sometimes arrive as one 200-page PDF. Splitting into per-drawing PDFs (one drawing per file) makes extraction and review more tractable. Most PDFs can be split with Adobe Acrobat, Foxit, or a command-line tool like pdftk.
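As one way to script the split, the sketch below builds a `pdftk` burst command, which writes one output file per page. It assumes `pdftk` is installed and on the PATH and that each drawing occupies a single page; the file and folder names are placeholders.

```python
import shlex
from pathlib import Path


def burst_command(source_pdf: str, out_dir: str) -> list[str]:
    """Build a pdftk command that writes one PDF per page into out_dir."""
    # pdftk expands the printf-style pattern to page_001.pdf, page_002.pdf, ...
    pattern = str(Path(out_dir) / "page_%03d.pdf")
    return ["pdftk", source_pdf, "burst", "output", pattern]


cmd = burst_command("drawing_set.pdf", "split")
print(shlex.join(cmd))

# To actually run it (the output folder must already exist):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a string avoids quoting problems when file names contain spaces.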
Contrast adjustment for faded drawings. If a drawing has low contrast (grey ink rather than black, or yellowed paper), adjust the levels at scan time. In scanner software this is usually a "brightness and contrast" or "levels" setting. For already-scanned images, a grayscale levels adjustment in any image editor before converting to PDF is sufficient.
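A levels adjustment is a linear stretch: pick the darkest and lightest values that matter and map them to pure black and pure white. The sketch below works on a plain list of grayscale values so it runs without an imaging library; the sample values and the black and white points are assumptions for illustration.

```python
def stretch_levels(pixels, black_point, white_point):
    """Linearly map black_point..white_point onto 0..255, clamping values outside."""
    span = white_point - black_point
    if span <= 0:
        raise ValueError("white_point must be greater than black_point")
    stretched = []
    for p in pixels:
        scaled = round((p - black_point) * 255 / span)
        stretched.append(max(0, min(255, scaled)))
    return stretched


# yellowed paper (~200), grey ink (120), paper, paper
faded = [200, 120, 195, 205]
print(stretch_levels(faded, black_point=100, white_point=200))  # [255, 51, 242, 255]
```

Before the stretch, ink and paper are 80 levels apart; afterwards they are about 200 apart, which is the separation that makes small text readable again.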
Handling hand-mark-up annotations on field prints
Field-printed drawings that went into the plant during commissioning often come back with hand-written annotations: calibration data pencilled next to transmitters, revised tag numbers in red ink, cable numbers added by hand, instrument addresses scrawled beside bubbles.
These annotations are part of the as-built record. The right approach:
- Keep them in the scan. Do not erase or mask annotations before scanning. They capture field changes that may not be in any other document.
- During extraction review, treat the printed tag as authoritative for ISA identification and the hand annotation as supplementary data. If the annotation changes the tag number (a common redline convention), capture both the original tag and the redlined replacement.
- Use the revision comparison workflow to reconcile annotated field copies against a clean re-issued revision. The difference set tells you which field changes were officially incorporated and which are still informal. For the full digitization workflow that handles annotation reconciliation as part of a brownfield drawing program, see the brownfield engineering guide.
Do not try to clean up a drawing by painting over annotations before scanning. You may be destroying the only record of a field modification.
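One way to "capture both" during review is a record that keeps the printed tag, any redlined replacement, and free-form field notes side by side. This is a hypothetical data structure sketched for illustration; the field names and the sample tag values are invented, not an export format from any tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TagRecord:
    """One instrument tag as read from an annotated field print (illustrative)."""
    printed_tag: str                     # authoritative for ISA identification
    redlined_tag: Optional[str] = None   # hand-written replacement, if any
    field_notes: list[str] = field(default_factory=list)  # calibration data, cable numbers, etc.

    @property
    def needs_reconciliation(self) -> bool:
        """True when a redline proposes a different tag than the printed one."""
        return self.redlined_tag is not None and self.redlined_tag != self.printed_tag


record = TagRecord("FIT-101", redlined_tag="FIT-101A", field_notes=["cable C-2045"])
print(record.needs_reconciliation)  # True
```

Records flagged this way are the ones to check against the re-issued revision, since they mark field changes that may or may not have been formally incorporated.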
When to re-scan vs work with what you have
Re-scan when: you have the original paper drawings available, the existing scan is below 200 DPI, or contrast is so low that tag text is visibly illegible to the naked eye.
Work with what you have when: originals are no longer accessible, the drawings are microfilm-based and further scanning would not improve resolution, or the existing scan is 300 DPI or better and the only issue is a few low-confidence tags in degraded areas.
If you are uncertain, process a representative page and check how many tags need manual correction. If more than 30% of tags on a page need manual checking, the source quality is borderline and re-scanning is worth the effort.
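The representative-page check reduces to a ratio and a threshold. The sketch below encodes the 30% rule from the paragraph above; the function names and the wording of the returned recommendations are illustrative.

```python
def manual_check_rate(tags_flagged: int, tags_total: int) -> float:
    """Fraction of tags on the sample page that needed manual checking."""
    if tags_total <= 0:
        raise ValueError("tags_total must be positive")
    return tags_flagged / tags_total


def recommend(tags_flagged: int, tags_total: int, originals_available: bool,
              threshold: float = 0.30) -> str:
    """Apply the 30% rule of thumb to one representative page."""
    rate = manual_check_rate(tags_flagged, tags_total)
    if rate > threshold and originals_available:
        return "re-scan"
    if rate > threshold:
        return "work with existing scan, budget extra review"
    return "work with existing scan"


print(recommend(14, 40, originals_available=True))   # re-scan (35% flagged)
print(recommend(6, 40, originals_available=True))    # work with existing scan (15% flagged)
```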
What actually helps
If you have control over how drawings are prepared before extraction:
Re-export from CAD if possible. If the original DWG or DGN files still exist, a fresh PDF export will always produce better results than any scan. Even if the drawings are old, the vector data is still clean. The P&ID digitization guide covers how to set up a drawing set for best results across both vector and scanned sources, including what to check before uploading a mixed-source package.
Scan at 300 DPI minimum. If you must scan, 300 DPI grayscale or black-and-white produces the best balance of quality and file size. Color scans are larger but don't improve text readability.
Use black-and-white mode for line drawings. Color scanning picks up paper yellowing, coffee stains, and background noise. B&W thresholding cleans all of that out and produces sharper text edges.
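Flatten markups before scanning. If the drawing has red-line revisions held as a separate markup layer, either accept and flatten them in the CAD source or scan without the markup overlay. Mixed layers confuse extraction. Handwritten annotations on field prints are a different case and should stay in the scan.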
Don't re-compress. Save scans as PDF with no additional JPEG compression. If your scanner software has a "quality" slider, set it to maximum.
Mixed drawing sets
Real projects rarely have uniform drawing quality. A typical brownfield set might include:
- 30 pages of clean CAD exports from a recent turnaround
- 15 pages scanned from the original 1990s construction package
- 5 pages that are photos of laminated control room copies
Each page type will produce different output quality. The key is knowing which pages need more attention during review. Vector pages need minimal checking. Degraded-scan pages are where review effort concentrates.
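For a mixed set, a simple sort by source type puts the pages that need the most attention at the front of the review queue. The sketch below uses the 30/15/5 example set; the source-type labels and the `review_queue` helper are assumptions for illustration.

```python
# Lower number = review first. Photos and scans get attention before vector pages.
REVIEW_ORDER = {"photo": 0, "scan": 1, "vector": 2}


def review_queue(pages):
    """Sort (page_number, source_type) pairs so the riskiest pages come first."""
    return sorted(pages, key=lambda page: (REVIEW_ORDER[page[1]], page[0]))


# The example set: 30 vector pages, 15 scanned pages, 5 photographed pages.
pages = (
    [(n, "vector") for n in range(1, 31)]
    + [(n, "scan") for n in range(31, 46)]
    + [(n, "photo") for n in range(46, 51)]
)

queue = review_queue(pages)
print(queue[0], queue[-1])  # (46, 'photo') (30, 'vector')
```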
Format recommendations by source
| Drawing source | Recommended preparation | Expected quality |
|---|---|---|
| Current CAD system (AutoCAD, MicroStation) | Export as vector PDF, no rasterization | Excellent |
| SmartPlant P&ID, AVEVA | Native PDF export | Excellent |
| Bluebeam project | Export or print to PDF | Excellent |
| Recent scan (< 5 years, 300 DPI) | Use as-is | Good |
| Old scan (> 10 years, unknown DPI) | Re-scan at 300 DPI if originals available | Fair to Good |
| Microfilm or aperture card scan | Re-scan at highest available DPI, B&W mode | Fair |
| Photo of printed drawing | Crop, straighten, increase contrast | Poor to Fair |
Related
- Brownfield P&ID digitization
- Revision comparison workflow
- Reading legacy 1980s P&IDs
- Commissioning loop check plan
FAQ
What DPI should I use when scanning P&IDs?
300 DPI is the minimum for reliable instrument-tag recognition. A drawing scanned at 200 DPI will still produce results, but expect more tags to need manual checking on small text, especially ISA bubbles with three-letter function codes. Scan at 300 DPI grayscale for drawings with pencil annotations. Use 300 DPI black-and-white for clean ink drawings. Anything above 400 DPI adds file size without improving text readability for standard instrument tags.
My drawings exist only as microfilm scans. Are they usable?
Yes, with caveats. Microfilm scans introduce grain and reduced dynamic range that make small text harder to read. Use the highest DPI your microfilm reader supports, typically 400 DPI for 16mm, 200-300 DPI for 35mm aperture cards. After scanning, a contrast enhancement pass in any image editor before generating the PDF helps considerably. Expect to spend more time in review on microfilm-sourced pages than on paper scans.
Can I use a photo taken on a phone?
Phone photos of drawings produce results, but with significantly higher review burden than scanned PDFs. The common problems are perspective distortion (the phone was not held perfectly parallel to the drawing), variable focus across the page, and insufficient resolution in the corners. If you must use photos, take them in good light, hold the phone parallel to the drawing surface, and crop tightly to the drawing border. Even then, a single frame covering a full A0 drawing rarely exceeds 200 DPI equivalent resolution (a 12-megapixel camera yields under 100 DPI), which is below ideal.
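The equivalent resolution of a photo depends on the camera's pixel count and how much of the frame the sheet fills, so it is worth estimating before relying on one. The sketch below is a rough calculation that assumes the sheet's long side runs along the image's long side; the 4000 and 8000 pixel widths stand in for roughly 12 and 48 megapixel sensors, and 1189 mm is the long side of an A0 sheet.

```python
MM_PER_INCH = 25.4


def effective_dpi(image_long_px: int, sheet_long_mm: float, fill_fraction: float = 1.0) -> float:
    """Approximate DPI of a photo, given how much of the frame's long side the sheet fills."""
    sheet_long_in = sheet_long_mm / MM_PER_INCH
    return image_long_px * fill_fraction / sheet_long_in


A0_LONG_MM = 1189

print(round(effective_dpi(4000, A0_LONG_MM)))  # 85
print(round(effective_dpi(8000, A0_LONG_MM)))  # 171
```

If the number comes out well below 300, photographing the drawing in overlapping sections or finding a scanner is usually the better use of time.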
The same drawing exists both as a CAD export and a scan. Which should I use?
Always the CAD export. A vector PDF from a CAD tool gives exact text strings for every tag, and no OCR step is needed. The scanned version degrades over time and cannot be improved without the original paper. The only reason to prefer the scan over the CAD export is if the CAD file does not reflect subsequent field changes that are captured on the scanned as-built redline.
How should I handle a drawing set where some pages are vector and some are scanned?
Upload the complete set as one project. Mixed-source drawing sets are handled per page. Vector pages produce clean output. Scanned pages produce output that needs closer review. Use the review interface to focus on those pages rather than reviewing every page uniformly.