announcements

Specifically for meta-information about the site

Intro to GIS with QGIS Tutorial Updated

I posted the latest version of my Introduction to GIS with QGIS tutorial manual, updated for QGIS 3.44 Solothurn. The manual and sample data are freely available from my lab’s tutorials page.

This year’s changes are minor, as there weren’t major software updates that would impact the exercises. I updated the sample data as it was growing a bit stale, which necessitated updating screenshots and references in the text. I also updated the source LaTeX code to ensure the PDF meets the new federal WCAG guidelines for digital accessibility. I added a brief troubleshooting section on using QGIS with a Mac to the Introduction, as a growing number of annoying problems have thrown my workshops off kilter (macOS security blocking installation and the writing of files to the user’s documents folder, and UI issues with the default color scheme and menus hiding in the background).

It’s hard to believe this is the 16th edition of this manual. When I wrote the first version back in 2011 (it was originally called Introduction to GIS Using Open Source Software), there was relatively little documentation on QGIS. In launching a workshop series and on-going support for it, I felt that I needed to create a basic guidebook. QGIS was far more primitive back then; converting the CRS for a GIS file required a separate command-line program, and you had to calculate natural breaks by hand! QGIS has certainly come a long way since then, and has been widely adopted. I’ve updated my materials as the software has evolved, but overall I’ve stuck to the same plan in terms of content, with modifications to the examples:

Introduction: a general overview of the workbook, goals for the material and workshop, and updates to the manual since the previous version.
An Overview of GIS: a short narrative that describes basic GIS concepts, and open source software.
Exploring the Interface: explanation of the interface, adding vector data and viewing and selecting features, adding raster data, web base maps, and project files.
Geographic Analysis: a cohesive case study that illustrates the process for doing an analysis while showcasing fundamental operations, including table joins, plotting coordinate data, selecting, filtering, and deleting features, and geoprocessing tools like intersection and buffering. My original case study was locating a new comic book store in NYC, subsequently modified to locating a coffee shop, and more recently identifying public libraries in Rhode Island that met criteria for a grant to host an after school program.
Thematic Mapping: exercise for understanding coordinate reference systems and how they function in QGIS, transforming systems, classifying data, and creating a final map layout. The original example was mapping healthcare sector employment by US state, but I eventually switched to voter participation in federal elections.
Data and Educational Resources: strategies and suggestions for finding GIS data, and recommendations for learning more through tutorials and workshops.

QGIS Interface — QGIS 3.44 Solothurn interface in Windows 11, used for the 16th edition of the manual in 2026

I’ve always used a follow-the-leader approach in teaching the workshop, where we stay together and the group does the material step by step. Everyone is a little fatigued by the end of the day, so for the thematic mapping piece we pivot to using the workbook; I demo the material and folks work on their own. I primarily wrote the book as a takeaway, so people could refer back to it after the session was over. It was also useful for self-directed learning, and the act of writing and updating it helps me internalize the material.

The last couple of years have been challenging though, and I have been thinking about what I could do differently. There have been a number of articles recently that discuss declining attention spans and students’ diminishing ability to thoughtfully engage with text (for example, in The Economist and The Atlantic). My own experiences bear this out. Increasingly, participants have a hard time following me in the session, and when I have student employees do the tutorial on their own as part of orientation, it takes them much longer and they struggle to finish. I have other workshops where I don’t use the follow-the-leader approach, and instead I demo the material and everyone works on their own using the text, and once everyone finishes we move on to the next part. This works no better, as many participants wander away (mentally and physically) from the exercises and struggle to relate the text to what’s on the screen. Now, a good deal of success hinges on having good instructions, but I update them every year, and have used the same system and material in earlier years to good effect.

I have considered creating a more interactive web version of the workbook, or creating videos. The problem with both is that updating them each year would be far more time consuming; in contrast, updating the text and images in the workbook is pretty straightforward. I had one colleague suggest that my “old school” workbook fills an important niche in terms of structure and format, and is useful “as is” for deep learners. A professor I work with, who has had similar experiences in the classroom, expressed that there’s a limit to what an instructor can do; we can only simplify the material so much, and it’s up to students to rise to the occasion.

Reflecting on how I learn, I have gradually moved to videos to learn new material: whether it’s understanding a particular GIS method or tool, mastering a video game, or figuring out how to replace the belt on my clothes dryer. Nothing beats a book for getting a comprehensive introduction to a topic, but the videos are helpful for targeted, stand-alone tasks. They can be comprehensive too, if they’re thoughtfully designed as part of a series. I’m definitely going to keep the workbook manual going, but am considering a video version for the future. I’ll wait until the 4.x version of QGIS becomes the new long term release (early next year) before I head in that direction, as the jump from 3.x to 4.x is more likely to come with some interface and functional changes.

The workbook has gotten a lot of miles these past 16 years, and I know others have used it for their own workshops and courses. Feel free to reach out if you have any suggestions.

Old QGIS Interface — QGIS 1.5 Tethys interface in Windows XP, used for the 1st edition of the manual in 2011

Talks and Travels: Conferences and the Camino

I haven’t been keeping up with posting, as these past few months have been atypical. April was devoted to attending conferences and giving presentations. Much of this was prompted by my recent work with the Data Rescue Project, and the HIFLD Open rescue initiative in particular. The month began with a panel at Brown’s Data Science Institute, where the topic was Trust in Data. A few days later, I joined local colleagues at the Northeast Higher Ed GIS Facilitators Meet Up in Worcester, MA. I was honored to serve as the keynote speaker for the annual Big 10 GIS Conference (held virtually), where I presented on preserving federal datasets and the HIFLD Open rescue initiative. Shortly thereafter, I traveled to the Census Bureau’s headquarters just outside of DC for FedGeoDay 2026 and served on a panel of non-federal data providers who are contributing to the national data ecosystems. I came back to Providence just in time to give a poster presentation on our GIS and Data Services at the CHAIRS-C conference (Center on Heat, Health, and Aging Innovation and Research Solutions for Communities at the Brown University School of Public Health).

Then in May, I went off the grid. My wife and I traveled to Northwestern Spain to walk the Camino de Santiago, or Way of St. James. Established in the Middle Ages, the Camino is a series of routes that drew Christian pilgrims from throughout Europe to the Cathedral in Santiago de Compostela, which is believed to be the resting place of the apostle St James, brother of St John. We walked the Camino Primitivo, which is the “original” route established by King Alfonso II circa 814 AD. It’s also considered to be the most challenging of the routes, as it climbs through mountains and forests before descending into farmland. It’s not considered a “wilderness” hike, however, as all of the routes follow a mix of unpaved and paved roads between towns and villages. A map of the primary routes (from Wikipedia) is below. The French Way is considered the primary route and is the most heavily traveled. The Portuguese and Northern Ways are also popular, followed by the Primitivo.

Map of the Primary Camino Routes — By WikiPate – credit to Sémhur, and for the logo: Manfred Zentgraf, CC BY-SA 4.0, Map from Wikimedia

The paths are marked at regular intervals, and whenever you have to turn or change direction. You look for a white or grey stone marker with a scallop shell, a symbol of St. James and of the Camino, to guide you. In places where a marker stone isn’t feasible, a blue and yellow tile of the shell is embedded in a wall or building to point the way. The system was so good that we rarely needed our phones; we used nothing more than our eyes and an excellent guidebook with detailed topographic maps of each stage of the journey, elevation diagrams, and a directory of landmarks and places to stay (the Village to Village Camino Guides – highly recommended).

The routes run into the hundreds of kilometers; the Primitivo is a shorter trek that covers about 320 km between Oviedo and Santiago, but in exchange for the shorter distances you have greater changes in elevation. As this was our first attempt at something like this, we opted to do half the route, beginning in a town called Grandas de Salime, located at a large dam and reservoir as you leave the region of Asturias and enter Galicia.

Our starting point: view of the Embalse de Salime from the Hotel Las Grandas

Accommodations and cafes serve pilgrims throughout the route; albergues offer a mix of hostel-like rooms (bunk beds in shared rooms) and single rooms, and there are also basic hotels. Our 183 km walk took us 9 days (3 days walking / 1 rest day in Lugo / 5 days walking). You are issued a pilgrim’s credential or passport when you begin, which grants you access to pilgrim-reserved accommodations and resources. On your journey, you need to get your passport stamped twice a day to verify that you are doing the walk. You always get a stamp at the places you stay, and in-between you can pick up others at cafes and restaurants, churches, museums, visitor centers, and even certain stores (we managed to get one at a cheese shop). Once you reach the cathedral in Santiago, you visit the pilgrim’s office (essentially the Camino DMV), where you present your passport to receive the Compostela, the official document that certifies that you finished the pilgrimage. You need to walk 100 km minimum (200 km if you’re biking) to qualify.

Camino Passport Stamps — A Selection of Stamps from my Pilgrim’s Credential or Passport

It was a deeply moving experience, retracing the steps that countless pilgrims took over a thousand years, and ending in front of the statue and tomb of St James behind the high altar in the cathedral, receiving the Eucharist at the Pilgrim’s mass. It was a relief to disconnect from technology and work, boiling life down to the singular goal of getting from point A to B each day. It was physically satisfying, pushing my body to walk 10 to 20 miles a day in rough terrain in all kinds of weather. It was wonderful to meet new friends; there is a cohort of people who happen to begin their journey simultaneously with you, and you see them throughout the walk, sharing the road for a time or a meal at the end of the day at the albergue. And it was a lot of fun, for a geographer who enjoys navigating a landscape with no digital doodads, and who loves collecting stamps!

An experience like this alters your perspective, and it’s been difficult to transition back to my normal routines. It has strengthened my belief that it’s time for me to consider new possibilities and next stages in my career. Please reach out (via LinkedIn or email in the sidebar) to share opportunities. My resume is available on the About page.

Stay tuned for some heat and climate-related dataset suggestions in my next post; resources I compiled for the heat conference, and new ones I’ve learned about at FedGeoDay.

HIFLD Next GIS Data Catalog

Last December, the Data Rescue Project (DRP) finished an initiative to download and archive over 400 geospatial data layers from the defunct HIFLD Open repository to DataLumos, an ICPSR-sponsored repository for federal government datasets. I wrote this brief post that summarized our work.

The Public Environmental Data Partners and Fulton Ring have launched a new community-shaped hub for finding, previewing and downloading GIS data collections, and its debut HIFLD Next collection is built on this rescued dataset. The portal increases the accessibility of the HIFLD Open data, with enhanced options for searching, previewing attribute tables and layers, and downloading or streaming data in a number of different formats. Beyond publishing a website, the group is hoping to build a community of practitioners around this project to support and sustain it, and to provide updated datasets and additional collections in the future.

They are keen to solicit feedback from GIS data users, and particularly from librarians and data specialists who provide active user support and who would potentially refer to the portal as a source. After you’ve explored the portal, feel free to submit feedback via their survey.

To learn more about the project, you can read this press release from the PEDP and this announcement from the DRP. The project is likely to be a primary topic of discussion at FedGeoDay 2026, which takes place in late April in Washington DC.

HIFLD Open Data Archived in DataLumos

Some good news to end the year: the Data Rescue Project has finished archiving all of the GIS data layers that were in the HIFLD Open portal, which was decommissioned at the end of summer. I wrote a post for the DRP that summarized the work we did, and you can find all the layers in ICPSR’s DataLumos repository, where you can search for and download layers one by one. I also archived the index for the series and a crosswalk that DHS published for locating updated versions of the data from the individual federal agencies that created them. If you wanted to download the entire set in bulk, it can be transferred from the Brown University Library’s GLOBUS endpoint; there are instructions for doing this on our library’s Data Rescue GitHub repo.

This project was an archival one, in that we were taking a final snapshot of what was in the repository before it went offline. In the coming year, I’ll be thinking about approaches for consistently capturing updates, and there are some folks who are interested in creating a community-driven portal to replace the defunct government site. Stay tuned!

2025 has been a tough year. Wishing you all the best for the year to come. – Frank

DataLumos HIFLD Open Archive

HIFLD Open GIS Portal Shuts Down Aug 26 2025

HIFLD Open, a key repository for accessing US GIS datasets on infrastructure, is shutting down on August 26, 2025. This is a revision from a previous announcement, which said that it would be live until at least Sept 30. The portal provided national layers for schools, power lines, flood plains, and more from one convenient location. DHS provides no sensible explanation for dismantling it, other than saying that hosting the site is no longer a priority for their mission (here’s a copy of an official announcement). In other words, “Public domain data for community preparedness, resiliency, research, and more” is no longer a DHS priority.

The 300 plus datasets in Open HIFLD are largely created and hosted by other agencies, and Open HIFLD was aggregating different feeds into one portal. So, much of the data will still be accessible from the original sources. It will just be harder to find.

DHS has published a crosswalk with links to alternative portals and the source feeds for each dataset, so you can access most of the data once Open HIFLD goes offline. I’ve saved a copy here, in case it also disappears. Most of these sources use ESRI REST APIs. Using ArcGIS Online or Pro, and even QGIS (for example), you can connect to these feeds, get a listing in your contents pane, and drag and drop layers into a project (many of the layers are also available via ArcGIS Online or the Living Atlas if you’re using Arc). Once you’ve added a layer to a project, you can export and save local copies.

QGIS ESRI Rest Services — Adding ArcGIS Rest Server for US Army Corps of Engineers Data in QGIS

If you want to download copies directly from Open HIFLD before it vanishes on Aug 26, I’ve created this spreadsheet with direct links to download pages, and to metadata records when available (some datasets don’t have metadata, and the links will bring you to an empty placeholder). Some datasets have multiple layers, and you’ll need to click on each one in a list to get to its download page. In some cases there won’t be a direct download link, and you’ll need to go to the source (a useful exercise, as you’ll need to remember where it is in the future). Alternatively, you can connect to the REST server (before Aug 26, 2025) in QGIS or ArcGIS, drag and drop the layers you want, and then export:

https://services1.arcgis.com/Hp6G80Pky0om7QvQ/ArcGIS/rest/services
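For anyone who would rather script the export than drag and drop, here is a minimal sketch of how the query URLs for a layer on an ArcGIS REST server can be put together. It only builds URLs and does not download anything. The service name `Example_Service` is hypothetical (browse the server's listing for real names), and the sketch assumes the layer is published as a FeatureServer that supports GeoJSON output and paging, which many hosted services do but not all. Servers also cap the number of records returned per request, so the page size of 1000 is a guess.

```python
from urllib.parse import urlencode

# Root of the REST server quoted above; swap in a source agency's
# server from the crosswalk once this one is offline.
REST_ROOT = "https://services1.arcgis.com/Hp6G80Pky0om7QvQ/ArcGIS/rest/services"


def build_query_url(service, layer_id=0, offset=0, page_size=1000, root=REST_ROOT):
    """Return a URL that asks a feature layer for one page of features."""
    params = {
        "where": "1=1",                   # no filter: select every feature
        "outFields": "*",                 # return all attribute columns
        "f": "geojson",                   # output format (assumed to be supported)
        "resultOffset": offset,           # number of features to skip
        "resultRecordCount": page_size,   # features per page (assumed limit)
    }
    return f"{root}/{service}/FeatureServer/{layer_id}/query?{urlencode(params)}"


def page_urls(service, total_features, page_size=1000):
    """List the query URLs needed to page through total_features records."""
    return [
        build_query_url(service, offset=start, page_size=page_size)
        for start in range(0, total_features, page_size)
    ]


if __name__ == "__main__":
    # "Example_Service" is a placeholder, not a real HIFLD layer name.
    for url in page_urls("Example_Service", total_features=2500):
        print(url)
```

Each URL could then be fetched and the pages merged into a single file, but for a handful of layers the export option in QGIS or ArcGIS is probably less work.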

I’m coordinating with the Data Rescue Project, and we’re working on downloading copies of everything on Open HIFLD and hosting it elsewhere. I’ll provide an update once this work is complete. Even though most of these datasets will still be available from the original sources, better safe than sorry. There’s no telling what could disappear tomorrow.

The secure HIFLD site for registered users will remain available, but many of the open layers aren’t being migrated there (see the crosswalk for details). The secure site is available to DHS partners, and there are restrictions on who can get an account. It’s not exactly clear what they are, but it seems unlikely that most Open users will be eligible: “These instructions [for accessing a secure account] are for non-DHS users who support a homeland security or homeland defense mission AND whose role requires access to the Geospatial Information Infrastructure (GII) and/or any geospatial dashboards, data, or tools housed on the GII…”

Rescuing US Government Data

There’s been a lot of turmoil emanating from Washington DC lately. One development that’s been more under the radar than others has been the modification or removal of US federal government datasets from the internet (for some news, see these articles in the New Yorker, Salon, Forbes, and CEN). In some cases, this is the intentional scrubbing or deletion of datasets that focus on topics the current administration doesn’t particularly like, such as climate and public health. In other cases, the dismemberment of agencies and bureaus makes data unavailable, as there’s no one left to maintain or administer it. While most government data is still available via functioning portals, most of the faculty and researchers I work with can identify at least a few series they rely on that have disappeared.

Librarians, archivists, researchers, professors, and non-profits across the country (and even in other parts of the world), have established rescue projects, where they are actively downloading and saving data in repositories. I’ve been participating in these efforts since January, and will outline some of the initiatives in this post.

The Internet Archive

The place of last resort for finding deleted web content is the Internet Archive. This large, non-profit project has been around as long as the web has existed, with the goal of creating a historic archive of the internet. It uses web crawlers or spiders to creep across the web and make copies of websites. With the Wayback Machine, you can enter a URL and find previous copies of web pages, including sites that no longer exist. You’re presented with a calendar page where you can scroll by year and month to select a date when a page was captured, which opens up a copy.

A Wayback Machine search for https://tools.niehs.nih.gov/cchhl/index.cfm. Blue circles on the calendar indicate when the page was captured.
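The same lookup can be done programmatically. The sketch below builds a request for the Wayback Machine's availability endpoint and pulls the closest capture out of the JSON it returns. It does not contact the Internet Archive; the `SAMPLE` response is made up to illustrate the shape of a reply and is not a real capture record, and the function names are my own.

```python
import json
from urllib.parse import urlencode

# Wayback Machine availability endpoint
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build a request URL asking for the capture closest to a given date."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp  # YYYYMMDD, optionally followed by hhmmss
    return f"{AVAILABILITY_API}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return (timestamp, url) for the closest capture, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


# Illustrative response only: the timestamp below is invented for this example.
SAMPLE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20250101000000",
            "url": "http://web.archive.org/web/20250101000000/"
                   "https://tools.niehs.nih.gov/cchhl/index.cfm",
        }
    }
})

if __name__ == "__main__":
    print(availability_url("https://tools.niehs.nih.gov/cchhl/index.cfm", "20250101"))
    print(closest_snapshot(SAMPLE))
```

When no capture exists, the service returns an empty `archived_snapshots` object, which is why the function returns `None` rather than raising an error.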

This allows you to see the content, navigate through the old website, and in many cases download files that were stored on those pages. It’s a great resource, but it can’t capture everything; given the variety and complexity of web pages and evolving web technologies, some websites can’t be saved in working order (either partially or entirely). Content that was generated and presented dynamically with JavaScript, or was pulled and presented from a database, is often not preserved, and neither are restricted pages that required log-ins.

An archived copy of the NIEHS page (the actual website was deleted in mid February 2025)

The Internet Archive also hosts a number of special collections where folks have saved documents, images, sound and video, and software. For example, you can find many research articles that are available in PubMed from the PubMed Central collection, a ton of documents from the USDA’s National Agricultural Library, and about 100 GB of data someone captured from the CDC in January 2025. A large project called the End of Term Archive was launched in 2008 to capture what federal government websites looked like at the end of each presidential term. The pages are saved in a special collection in the IA.

Data Rescue Project

Dozens of new data archiving projects were launched at the end of 2024 and beginning of 2025 with the intention of saving federal datasets. The Data Rescue Project is one of the larger efforts, which has been driven by data librarians and archivists with non-profit partners. Professional groups including IASSIST, ICPSR, RDAP, the Data Curation Network, and the Safeguarding Research & Culture project have been active organizers and participants. While this will be an oversimplification, I’ll summarize the project as having two goals.

The first goal is to keep track of what the other archiving projects are, and what they have saved. To this end, they created the Data Rescue Tracker, which has two modules. The Downloads List is an archive of datasets that have been saved, with details about where the data came from and locations of archived copies. The Maintainers List is a catalog of all the different preservation projects, with links to their home pages. There is also a narrative page with a comprehensive list of links to the various rescue efforts, data repositories, alternate sources for government data, and tools and resources you can use to save and archive data.

The second goal is to contribute to the effort of saving and archiving data. The team maintains an online spreadsheet with tabs for agencies that contain lists of datasets and URLs that are currently prioritized for saving. Volunteers sign up for a dataset, and then go out and get it. Some folks are manually downloading and saving files (pointing and clicking), while others write short screen scraping scripts to automate the process. The Data Rescue Project has partnered with ICPSR, a preeminent social science research center and repository in the US, at the University of Michigan. They created a repository called DataLumos, which was launched specifically for hosting extracts of US federal government data. Once data is captured, volunteers organize it and generate metadata records prior to submitting it to DataLumos (provided that the datasets are not too big).

DataLumos archive for federal government datasets, maintained by ICPSR
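As a rough illustration of the "short screen scraping scripts" that volunteers write, here is a minimal sketch that scans a page's HTML for links to data files, using only the Python standard library. It is not the Data Rescue Project's actual tooling; the list of file extensions and the `example.gov` addresses are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# File types treated as "data" here; an assumption, adjust per site.
DATA_EXTENSIONS = (".csv", ".zip", ".xlsx", ".json", ".geojson")


class DataLinkParser(HTMLParser):
    """Collect absolute URLs of links that point to data files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Ignore any query string when checking the file extension
        path = href.lower().split("?")[0]
        if path.endswith(DATA_EXTENSIONS):
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:  # skip duplicate links
                self.links.append(absolute)


def find_data_links(html, base_url):
    """Return the data file URLs found in an HTML string, in page order."""
    parser = DataLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Hypothetical page content for demonstration
    page = '<a href="/files/schools.csv">Schools</a> <a href="about.html">About</a>'
    print(find_data_links(page, "https://example.gov/data/"))
```

A real script would fetch each page first and then download every link it finds, pausing between requests so as not to overload the agency's server. Pages that build their links with JavaScript won't work with this approach.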

Most of the datasets that DRP is focused on are related to the social sciences and public policy. The Data Rescue Project coordinates with the Environmental Data and Governance Initiative and the Public Environmental Data Partners (which I believe are driven by non-profit and academic partners), who are saving data related to the environment and health. They have their own workflows and internal tracking spreadsheets, and are archiving datasets in various places depending on how large they are. Data may be submitted to the Internet Archive, the Harvard Dataverse, GitHub, SciOp, and Zenodo (you can find out where in the Data Rescue Tracker’s Downloads List).

Mega Projects

There are different approaches for tackling these data preservation efforts. For the Data Rescue Project and related efforts, it’s like attacking the problem with millions of ants. Individual people are coordinating with one another in thousands of manual and semi-automated download efforts. A different approach would be to attack the problem with a small herd of elephants, who can employ larger resources and an automated approach.

For example, the Harvard Law School Library Innovation Lab launched the Archive of data.gov, a large project to crawl and download everything that’s in data.gov, the US federal government’s centralized data repository. It mirrors all the data files stored there and is updated regularly. The benefit of this approach is that it captures a comprehensive amount of data in one go, and can be readily updated. The primary limitation is that there are many cases where a dataset is not actually stored in data.gov, but is referenced in a catalog record with a link that goes out to a specific agency’s website. These datasets are not captured with this approach.

If trying to find back-ups is a bit bewildering, there’s a tool that can help. Boston University’s School of Public Health and Center for Health Data Science have created a find lost* data search engine, which crawls across the Harvard Project, DataLumos, the Data Rescue Project, and others.

Beyond the immediate data preservation projects that have sprung up recently, there are a number of large, on-going projects that serve as repositories for current and historical datasets. Some, like IPUMS at the University of Minnesota and the Election Lab at MIT focus on specific datasets (census data for the former, election results data for the latter). There are also more heterogeneous repositories like ICPSR (including OpenICPSR which doesn’t require a subscription), and university-based repositories like the Harvard Dataverse (which includes some special collections of federal data extracts, like CAFE). There are also private-sector partners that have an equal stake in preserving and providing access to government data, including PolicyMap and the Social Explorer.

Wrap-up

I’ve been practicing my Python screen scraping skills these past few months, and will share some tips in a subsequent post. I’ve been busy contributing data to these projects and coordinating a response on my campus. We’ve created a short list of data archives and alternative sources, which captures many of the sources I’ve mentioned here plus a few others. My library colleagues in the health and medical sciences have created a list of alternatives to government medical databases including PubMed and ClinicalTrials.gov.

Having access to a public and robust federal statistical system is a non-partisan issue that we should all be concerned about. Our Constitution justifies (in several sections) that we should have such a system, and we have a large body of federal laws that require it. Like many other public goods, the federal statistical system contributes to providing a solid foundation on which our society and economy rest, and helps drive innovation in business, policy, science, and medicine. It’s up to us to protect and preserve it.


At These Coordinates

Dispatches from the Geospatial Data World