Comparison of GIS vector file formats Source: en.wikipedia.org/wiki/Comparison_of_GIS_vector_file_formats

In geographic information systems (GIS), vector file formats are used to represent geographic features such as points, lines, and polygons, along with associated attribute data.[1] These formats are essential for storing, exchanging, and analyzing spatial data across both desktop and web-based GIS platforms. The structure and capabilities of each format — including support for coordinate reference systems, metadata, and topology — greatly affect interoperability and analytical accuracy.[2]

This article presents a comparison of popular GIS and CAD vector formats, detailing their design authorities, licensing models, technical properties, and common use cases. It is intended to help users choose the most appropriate format based on software compatibility, performance, and data exchange needs.[3]

General information

The table below lists several widely used GIS/CAD vector formats, their managing authorities, and licensing terms.

Format | Design authority | License
AutoCAD DXF | Autodesk | Proprietary[4]
Cartesian coordinate system (XYZ) | |
GML | Open Geospatial Consortium (OGC) / ISO | Open standard[5]
MapInfo TAB | Precisely (formerly MapInfo Corporation) | Proprietary[6]
Shapefile | ESRI | Hybrid (proprietary, with a published specification)[7][8]
TIGER | US Census Bureau | Public domain[9][10]

Format descriptions

AutoCAD DXF

The Drawing Interchange Format (DXF) is a CAD data file format developed by Autodesk and introduced in December 1982 alongside AutoCAD 1.0 to enable data exchange between AutoCAD and other applications. It supports ASCII encoding (since the initial release) and binary encoding (since AutoCAD Release 10 in 1988). A DXF file is a sequence of tagged values: each value is preceded by an integer group code that identifies its meaning. The format is only partially publicly documented, and newer entity types may lack full documentation.[4]
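
The group-code structure can be illustrated with a minimal Python sketch that reads an ASCII DXF as alternating lines: a group code, then its value. The function names and the sample drawing are hypothetical and written for illustration only; the group codes used (0 for an entity or marker type, 2 for a name, 8 for a layer, 10/20 and 11/21 for X/Y coordinates) follow the commonly documented DXF conventions. Production code would normally rely on an existing DXF library rather than a hand-written reader.

```python
from typing import Iterator, List, Tuple


def iter_dxf_tags(text: str) -> Iterator[Tuple[int, str]]:
    """Yield (group_code, value) pairs from the text of an ASCII DXF file."""
    lines = text.splitlines()
    # Tags occupy two lines each: an integer group code, then the value.
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i].strip()), lines[i + 1].strip()


def section_names(text: str) -> List[str]:
    """Return the names of the sections found in an ASCII DXF text."""
    names = []
    expect_name = False
    for code, value in iter_dxf_tags(text):
        if code == 0 and value == "SECTION":
            # The tag right after "0 / SECTION" carries the section name.
            expect_name = True
        elif expect_name and code == 2:
            names.append(value)
            expect_name = False
        else:
            expect_name = False
    return names


# Hypothetical, heavily trimmed drawing: a header section and one LINE entity.
SAMPLE = "\n".join([
    "0", "SECTION", "2", "HEADER", "0", "ENDSEC",
    "0", "SECTION", "2", "ENTITIES",
    "0", "LINE", "8", "0", "10", "0.0", "20", "0.0", "11", "5.0", "21", "5.0",
    "0", "ENDSEC", "0", "EOF",
])

if __name__ == "__main__":
    print(section_names(SAMPLE))  # ['HEADER', 'ENTITIES']
```

Because every value is tagged, a reader can skip group codes it does not understand, which is one reason partially documented entity types do not necessarily prevent a file from being read.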

Geography Markup Language (GML)

GML is an XML-based grammar defined by the OGC and standardized as ISO 19136:2007 (updated as ISO 19136-1:2020). It models geographic features, geometries, coordinate reference systems, coverages, and sensor data. GML profiles (e.g., the Simple Features Profile) define restricted subsets for specific applications. GML is widely used as the feature encoding in OGC Web Feature Service (WFS) and underpins vendor-neutral data exchange.[5]
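
As a rough illustration of how a GML geometry carries its coordinate reference system, the following Python sketch parses a single point using only the standard library. The sample document, the function name, and the choice of the GML 3.2 namespace are assumptions made for this example; real GML usually arrives wrapped in application-schema feature elements and is better handled by a GIS library.

```python
import xml.etree.ElementTree as ET

# Namespace URI of GML 3.2 (assumed here; older documents use other URIs).
GML = "http://www.opengis.net/gml/3.2"

# Hypothetical sample: one point with an explicit CRS reference.
SAMPLE = """<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2"
    gml:id="p1" srsName="urn:ogc:def:crs:EPSG::4326">
  <gml:pos>45.256 -110.45</gml:pos>
</gml:Point>"""


def read_gml_point(xml_text):
    """Return (srsName, coordinates) for the first gml:Point in the text."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so a bare Point document works too.
    point = next(root.iter("{%s}Point" % GML), None)
    if point is None:
        raise ValueError("no gml:Point element found")
    pos = point.find("{%s}pos" % GML)
    if pos is None or not pos.text:
        raise ValueError("gml:Point has no gml:pos content")
    coords = tuple(float(v) for v in pos.text.split())
    # Axis order is defined by the referenced CRS, not by GML itself.
    return point.get("srsName"), coords


if __name__ == "__main__":
    print(read_gml_point(SAMPLE))
```

The sketch returns the coordinates in document order and leaves their interpretation to the caller, since the meaning of each axis depends on the CRS named in srsName.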

MapInfo TAB

MapInfo TAB is a proprietary GIS vector format created by MapInfo (now Precisely). A dataset consists of several files that share a base name: .TAB (table structure), .DAT (attributes), .MAP (geometry), and optional .ID/.IND index files. It supports points, lines, polygons, text annotations, spatial indexing, and multiple coordinate systems. It is the native format of MapInfo Pro and is supported by other software via GDAL/OGR.[6]
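
Because a TAB dataset is spread over several files, copying only the .TAB file produces a dataset that cannot be opened. The Python sketch below checks which component files sit next to a given .TAB file. It assumes, following the description above, that .TAB, .DAT, and .MAP are expected for a mappable table and that .ID and .IND are reported but not required; the function name and this required/optional split are illustrative choices, not part of any MapInfo API.

```python
from pathlib import Path

# Assumed split for a mappable table, based on the description above.
CORE = (".tab", ".dat", ".map")
OPTIONAL = (".id", ".ind")


def tab_components(tab_path):
    """Return (present, missing) component files for a MapInfo TAB dataset.

    present maps lower-cased extensions to paths; missing lists the core
    extensions that were not found beside the .TAB file.
    """
    tab_path = Path(tab_path)
    stem = tab_path.stem.lower()
    found = {}
    for sibling in tab_path.parent.iterdir():
        # Match the base name case-insensitively (ROADS.DAT vs roads.tab).
        if sibling.is_file() and sibling.stem.lower() == stem:
            found[sibling.suffix.lower()] = sibling
    present = {ext: found[ext] for ext in CORE + OPTIONAL if ext in found}
    missing = [ext for ext in CORE if ext not in found]
    return present, missing


if __name__ == "__main__":
    import sys

    present, missing = tab_components(sys.argv[1])
    print("present:", sorted(present))
    print("missing:", missing)
```

A check like this is useful before archiving or transferring a dataset, since the individual files give no warning when a sibling is absent.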

Shapefile

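Introduced by ESRI in the early 1990s with ArcView GIS 2 (the technical specification was published in 1998), the Shapefile format comprises .shp (geometry), .shx (positional index of the geometry records), .dbf (attribute table in dBASE format), and optional files such as .prj (coordinate reference system) and .cpg (character encoding). It supports simple features: point, polyline, polygon, and multipoint. While popular and widely supported, it has notable limitations: the dBASE attribute format with field names limited to 10 characters and limited Unicode support, a 2 GB size limit per component file, and no topology.[7][8]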
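
To give a sense of the binary layout of the .shp component, here is a minimal Python sketch that decodes the fixed 100-byte file header. The byte offsets, the file code 9994, and the shape type codes are taken from the published ESRI Shapefile Technical Description rather than from the text above, and the function and helper names are hypothetical. The header mixes big-endian and little-endian integers, which the sketch handles with separate unpack calls.

```python
import struct

# Shape type codes for the basic 2D types in the published specification.
SHAPE_TYPES = {0: "Null Shape", 1: "Point", 3: "PolyLine",
               5: "Polygon", 8: "MultiPoint"}


def read_shp_header(header: bytes) -> dict:
    """Decode the 100-byte header found at the start of a .shp file."""
    if len(header) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack(">i", header[0:4])       # big-endian
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    (length_words,) = struct.unpack(">i", header[24:28])  # 16-bit words
    version, shape_code = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_code, "code %d" % shape_code),
        "bbox": (xmin, ymin, xmax, ymax),
    }


def demo_header() -> bytes:
    """Build a synthetic header for an empty polygon file (illustration only)."""
    return (struct.pack(">i", 9994) + bytes(20)      # file code + unused fields
            + struct.pack(">i", 50)                  # 50 words = 100 bytes
            + struct.pack("<ii", 1000, 5)            # version, Polygon
            + struct.pack("<8d", 0.0, 0.0, 10.0, 20.0, 0.0, 0.0, 0.0, 0.0))


if __name__ == "__main__":
    print(read_shp_header(demo_header()))
```

With a real file, the same function could be called on the first 100 bytes read in binary mode, for example `read_shp_header(open(path, "rb").read(100))`.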

TIGER

The Topologically Integrated Geographic Encoding and Referencing (TIGER) format was developed by the US Census Bureau to represent geographic features (roads, boundaries, water) for census purposes. Distributed as public domain TIGER/Line shapefiles, TIGER data includes geographic codes (GEOIDs) that can be linked to demographic data.[9][10]
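
Linking TIGER geometry to demographic tables usually comes down to matching GEOID strings. The Python sketch below splits a 15-digit census block GEOID into its parts, assuming the Census Bureau's usual layout of state (2 digits), county (3), tract (6), and block (4), where the first block digit is the block group. That layout and the function names are not stated in the text above and are included here as labeled assumptions.

```python
# Assumed component widths of a census block GEOID: 2 + 3 + 6 + 4 digits.
LEVEL_LENGTHS = {"state": 2, "county": 5, "tract": 11,
                 "block_group": 12, "block": 15}


def split_block_geoid(geoid: str) -> dict:
    """Split a 15-digit block GEOID into its component codes."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError("expected a 15-digit block GEOID")
    return {
        "state": geoid[0:2],
        "county": geoid[2:5],
        "tract": geoid[5:11],
        "block_group": geoid[11],   # first digit of the block code
        "block": geoid[11:15],
    }


def geoid_at_level(geoid: str, level: str) -> str:
    """Truncate a block GEOID to the GEOID of an enclosing geography."""
    return geoid[:LEVEL_LENGTHS[level]]


if __name__ == "__main__":
    sample = "482012231001050"
    print(split_block_geoid(sample))
    print(geoid_at_level(sample, "tract"))  # 48201223100
```

Because each level's GEOID is a prefix of the levels below it, block-level geometry can be joined to tract- or county-level tables by truncating the key. GEOIDs should be kept as strings, since leading zeros are significant.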

Technical comparison

  • Encoding: DXF (ASCII or binary), GML (XML), Shapefile (binary), TAB (mixed text and binary), TIGER (distributed as Shapefile)
  • Geometry types: DXF (all CAD entities), GML (points, lines, polygons, coverages), Shapefile (simple features), TAB (points, lines, polygons, and text), TIGER (simple features)
  • Metadata/projections: DXF (limited), GML (rich, with CRS), Shapefile (optional .prj), TAB (built-in), TIGER (.prj, GEOID codes)
  • Limitations: DXF (partially documented specification), GML (verbose), Shapefile (2 GB limit, no topology), TAB (proprietary), TIGER (census-specific)

Usage and support

  • DXF – used in CAD–GIS handoffs; supported by AutoCAD, LibreCAD, QGIS.
  • GML – used in OGC Web Feature Services (WFS); supported by QGIS, GeoServer, ArcGIS.
  • TAB – native to MapInfo Pro; supported via GDAL/OGR.
  • Shapefile – ubiquitous in QGIS/ArcGIS, GDAL/OGR, and libraries such as GeoPandas, Fiona, and pyshp.
  • TIGER – consumed in US demographic mapping; downloadable via Census FTP.
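
Since GDAL/OGR is the common denominator in the list above, a small Python sketch can show how the formats map onto OGR driver names. The lookup table and function are hypothetical helpers; the driver names themselves ("DXF", "GML", "MapInfo File", "ESRI Shapefile") are the short names GDAL uses. In practice GDAL normally detects the format on its own when opening a file, so a table like this is mainly useful when creating output or validating user input.

```python
from pathlib import Path
from typing import Optional

# File extension -> GDAL/OGR vector driver short name.
# TIGER/Line data is distributed as shapefiles, so it falls under ".shp".
OGR_DRIVER_BY_EXTENSION = {
    ".dxf": "DXF",
    ".gml": "GML",
    ".tab": "MapInfo File",
    ".shp": "ESRI Shapefile",
}


def guess_ogr_driver(path) -> Optional[str]:
    """Return the OGR driver name suggested by a file's extension, if any."""
    return OGR_DRIVER_BY_EXTENSION.get(Path(path).suffix.lower())


if __name__ == "__main__":
    for name in ("parcels.SHP", "site_plan.dxf", "roads.tab", "notes.txt"):
        print(name, "->", guess_ogr_driver(name))
```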

Advantages and limitations

DXF
Advantages: CAD interoperability, human-readable ASCII
Disadvantages: partial specification, large file size
GML
Advantages: extensible, CRS/metadata support, open standard
Disadvantages: verbose, steep learning curve
TAB
Advantages: indexed, fast spatial queries
Disadvantages: proprietary format
Shapefile
Advantages: simple, widely supported
Disadvantages: attribute/Unicode limits, no topology
TIGER
Advantages: free, Census-ready
Disadvantages: US-specific, not fully general-purpose

References

  1. ^ Bolstad, Paul (2016). GIS Fundamentals: A First Text on Geographic Information Systems (5th ed.). Eider Press. ISBN 9781506695877.
  2. ^ Longley, Paul A., Goodchild, M. F., Maguire, D. J., & Rhind, D. W. (2015). Geographic Information Science and Systems (4th ed.). Wiley. ISBN 9781118676954.
  3. ^ GDAL/OGR Vector Drivers Overview. Retrieved June 2025 from https://gdal.org/drivers/vector/index.html
  4. ^ a b EzDXF documentation: "DXF was originally introduced in December 1982…to provide an exact representation…." FileFormat.com: "DXF uses group codes…"
  5. ^ a b OGC GIS standard and ISO 19136-1:2020 for GML, defining XML schema for transport/storage of geographic information
  6. ^ a b Library of Congress: "MapInfo dataset consists of… .TAB, .DAT, .MAP files." Spotzi help: TAB links .DAT/.MAP/.ID into dataset
  7. ^ a b Library of Congress: "The ESRI Shapefile format… stores nontopological geometry and attribute information…"
  8. ^ a b Esri ArcMap documentation: "Shapefiles make use of the dBASE file format… Unicode lacks…"
  9. ^ a b US Census Bureau: TIGER/Line shapefiles include geographic entity codes linked to demographic data
  10. ^ a b Wikipedia: TIGER format used by US Census to describe physical/cultural features; public domain