I’m working with the Ceremonial County Boundaries of England shapefile available here: https://www.data.gov.uk/dataset/0fb911e4-ca3a-4553-9136-c4fb069546f9/ceremonial-county-boundaries-of-england
The shapefile is in British National Grid (EPSG:27700).
I also have a DataFrame of locations with latitude/longitude (WGS84). I want to determine which ceremonial county each point falls into.
For most counties, the spatial join works correctly. However, for Cornwall, Greater London, and Rutland, the join always returns NaN, even though when plotted the points are clearly inside the county boundary.
Code
import geopandas as gpd
from shapely.geometry import Point

regions = gpd.read_file("Ceremonial_County_Boundaries.shp")

# lon/lat are WGS84, so the points are declared as EPSG:4326 before reprojecting
points = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326")
points_bng = points.to_crs(regions.crs)

# Spatial join
joined = gpd.sjoin(points_bng, regions, how="left", predicate="within")

What's going wrong
The spatial join correctly matches ~90% of the locations. But all points in Cornwall, Greater London, and Rutland come back as unmatched (index_right = NaN).
I isolated the polygons from the shapefile:
county_poly = regions[regions["NAME"] == "Cornwall"]
print("Valid:", county_poly.geometry.is_valid.unique())

Output:
Valid: [False]

The same happens for Greater London and Rutland: all three return invalid geometries.
Every other county returns Valid: [True] and works correctly.
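For reference, the validity check can be reproduced on a toy shape. This is only a minimal sketch, assuming Shapely 1.8 or newer is installed; the bow-tie polygon and the probe point are made up for illustration and are not taken from the shapefile. It shows how explain_validity reports the reason a geometry is invalid and how make_valid produces a valid equivalent. Whether the same repair would fix the failed join for the three counties is an assumption that has not been verified here.

```python
from shapely.geometry import Point, Polygon
from shapely.validation import explain_validity, make_valid

# Hypothetical self-intersecting "bow-tie" ring: its edges cross at (2, 2)
bowtie = Polygon([(0, 0), (4, 4), (4, 0), (0, 4)])

print("Valid:", bowtie.is_valid)              # the same kind of check as in the question
print("Reason:", explain_validity(bowtie))    # human-readable reason for the invalidity

# make_valid returns a new, valid geometry covering the same area
repaired = make_valid(bowtie)
print("Repaired type:", repaired.geom_type)
print("Repaired valid:", repaired.is_valid)

# Hypothetical probe point lying inside the left-hand triangle of the bow-tie
probe = Point(1, 2)
print("Probe within repaired:", probe.within(repaired))
```

On a GeoDataFrame, the same function could be applied per geometry, for example regions.geometry.apply(make_valid), before running the join.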
What I have tried
Confirmed visually that the points lie inside the polygon
Plotting shows the points visibly inside the Cornwall/London/Rutland boundaries.

CRS is correct
Shapefile: EPSG:27700
Points: EPSG:4326 → converted to 27700
Any help or thoughts would be appreciated.
