The Hospitality Data You're Already Sitting On

Introduction

You Don't Have a Data Scarcity Problem

Most independent hospitality operators, asked whether they have enough data to make better decisions, will say no. The instinct is wrong. The reality of an independent hospitality operation, after two or more years of running, is the opposite: the data exists, it is generated daily, and it accumulates whether or not anyone is looking at it.

The constraint is not scarcity. The constraint is attention.

This spoke catalogues the operational data sources an independent hospitality business produces in the course of running, what each source tells the operator about their business, what is straightforward to read, what is harder, and how the sources connect to one another.

Data inventory: the sources your hospitality operation already generates — Every message, call, review, and booking pattern is a signal. Read as a set rather than a sequence, they map where bookings are won and lost.

The inventory is organised by source, listed in rough order of yield-per-effort. Messaging history is at the top because it is the highest-volume, highest-signal source available to almost every operator with two or more years of operating history. Spreadsheets and ad-hoc notes are at the bottom — while they often contain the most operator judgement, they are also the hardest to read systematically.

#1 yield per hour: platform messages + WhatsApp. Start here before any other source.

90 days: minimum look-back window for pattern recognition. Less produces noise, not signal.

2+ yrs: operating history before analytical work becomes reliable. Data accumulation threshold.

Source 1 — Highest yield

Platform Messages

Across Airbnb, Booking.com, Vrbo, and any other OTA the operator uses, the messaging history is the single richest operational data source an independent hospitality business produces. Every guest interaction sits there: the inquiry, the questions before booking, the negotiation if any, the pre-arrival logistics, the during-stay coordination, the post-stay follow-up.

The yield-per-hour-spent on platform messages is the highest of any data source in this inventory. An operator who reads ninety days of messages with one specific question in mind will, almost without exception, find at least one recurring question they had not previously recognised as a pattern.

What it tells you

Read as a set rather than a sequence of operational events, platform messages reveal most of what matters about pre-booking guest behaviour: which questions appear most often before guests book; which questions appear most often before guests don't book (the conversations that go quiet are as informative as those that convert); which times of day produce the highest-intent inquiries; how response timing correlates with conversion; which language groups arrive in the inbox and how often the operator handles them well.
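A minimal sketch of the "which questions appear most often" reading, assuming the operator has already copied guest messages into a list of (date, text) pairs. The topic names, keywords, and sample messages are hypothetical placeholders, and plain keyword matching is a deliberately crude stand-in for actually reading the threads.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical topic keywords; an operator would substitute their own.
TOPIC_KEYWORDS = {
    "parking": ("parking", "garage"),
    "check-in": ("check-in", "check in", "keys"),
    "wifi": ("wifi", "wi-fi", "internet"),
}

def tally_topics(messages, today, window_days=90):
    """Count messages inside the look-back window that mention each topic."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter()
    for sent_on, text in messages:
        if sent_on < cutoff:
            continue  # older than the look-back window
        lowered = text.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            # A message counts at most once per topic.
            if any(keyword in lowered for keyword in keywords):
                counts[topic] += 1
    return counts

# Illustrative data only, not real guest messages.
sample_messages = [
    (date(2024, 5, 2), "Is there parking near the flat?"),
    (date(2024, 5, 9), "What time is check-in? Also, is parking free?"),
    (date(2024, 1, 3), "Is the wifi fast enough for video calls?"),
    (date(2024, 6, 1), "Is there a garage we can use?"),
]

topic_counts = tally_topics(sample_messages, today=date(2024, 6, 15))
for topic, n in topic_counts.most_common():
    print(topic, n)
```

The January message falls outside the ninety-day window and is ignored, which is the point of fixing the window before counting.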

Practical challenge

The messaging history sits in different platforms, each with its own export format, none of which makes the data easily portable. Reading messages within each platform is straightforward — the platforms have search functions, and an operator can review the last ninety days inside the platform interface in two or three hours. Reading messages across platforms together is materially harder. Most operators end up doing the cross-platform analysis manually, by reading each platform's history in turn and taking notes.

Source 2

WhatsApp & Direct Chat

For operators in markets where WhatsApp is the dominant guest communication channel — most of southern Europe, Latin America, much of the United States vacation rental market — the WhatsApp threads often contain more guest interaction volume than any single OTA platform. They are also the messages most likely to contain substantive pre-arrival concerns, because WhatsApp tends to surface the questions guests didn't feel comfortable asking on the platform.

What it tells you

Platform messages capture the booking-flow conversation; WhatsApp captures the relationship conversation: pre-arrival anxiety, the small details guests want clarified before showing up, the moments where guests need help during their stay, and the questions from previous guests that turn into repeat-booking inquiries. The yield on WhatsApp data is highest for understanding the post-booking, pre-arrival friction that determines whether the stay starts well, plus the during-stay issues that turn into review themes.

For operators where WhatsApp is the primary channel

This source belongs at the top of the inventory rather than second. The general rule: read whichever channel carries the most actual guest words first.

The practical challenge is structural: WhatsApp does not export easily, the search across threads is limited inside the app, and operators with multiple WhatsApp Business accounts (one per property or one per host identity) often have data fragmented across phone numbers.

Source 3

Reviews

Reviews exist across multiple platforms — Airbnb, Booking.com, Google, TripAdvisor, Vrbo. They are public and structured, which makes them the easiest data source in this inventory to aggregate and read systematically. Most operators are already reading their reviews individually. The El Dorado opportunity in reviews comes from reading them as a set rather than as individual feedback events.

What reviews tell you in aggregate

Reviews tell you what guests experience — framed in the post-stay reflective register, which is materially different from what the same guests communicated during the booking flow. A guest who asked five questions about parking before booking, then writes a positive review that mentions parking again, is signalling that parking is sufficiently important to them that it warrants attention before, during, and after the stay. A guest who didn't ask anything about parking and writes a negative review complaining about parking is signalling a different problem entirely — one of expectation-setting.

The most actionable use of review data is paired with messaging data: questions guests asked before booking that later appear as complaints in reviews are a direct signal that the listing or pre-arrival communication is not doing its work.
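The messages-plus-reviews pairing can be sketched as a set comparison. This assumes the operator has separated pre-booking questions and negative reviews into two plain lists; the keyword table and sample texts are hypothetical, and keyword matching only approximates a careful read.

```python
# Hypothetical topic keywords; replace with the operator's own themes.
TOPIC_KEYWORDS = {
    "parking": ("parking",),
    "wifi": ("wifi", "wi-fi"),
    "noise": ("noise", "noisy", "loud"),
}

def topics_in(text):
    """Return the set of topics whose keywords appear in the text."""
    lowered = text.lower()
    return {
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }

def topics_across(texts):
    """Union of topics mentioned anywhere in a list of texts."""
    found = set()
    for text in texts:
        found |= topics_in(text)
    return found

# Illustrative data only.
pre_booking_messages = [
    "Is there parking near the flat?",
    "How far is the beach? Is parking included?",
    "Is the wifi reliable?",
]
negative_reviews = [
    "Parking was a nightmare.",
    "Street noise kept us up all night.",
]

asked = topics_across(pre_booking_messages)
complained = topics_across(negative_reviews)

# Asked before booking AND complained about afterwards:
# the listing or pre-arrival communication is not doing its work.
asked_then_complained = asked & complained

# Complained about but never asked: an expectation-setting problem.
never_asked_but_complained = complained - asked

print(sorted(asked_then_complained))
print(sorted(never_asked_but_complained))
```

The two result sets map onto the two parking cases described earlier: a concern raised before and after the stay, and a complaint nobody thought to ask about.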

Reviews also carry language and origin signals: which language groups review most generously, which review most critically, which review at all. Some language groups review at materially lower rates than others, which biases an operator's review-score read of their own performance.

Source 4

Calls & Voicemails

For operators with a public phone line — common in higher-end vacation rentals, boutique aparthotels, and small hotel groups — the call log is a high-signal data source that almost no independent operator reads systematically. The volume is lower than messaging but the signal per interaction is often higher: guests who call rather than message are typically further along in the booking process, more decision-ready, and more likely to be asking questions that messaging didn't resolve.

Missed calls: the most under-read data

A missed call from an unrecognised number that does not result in a booking is, in most cases, an inquiry the operator lost. Most operators do not track missed calls as inquiries because there is no booking attached, but the volume of missed calls during specific hours of the day is one of the cleanest available indicators of when the operator's response coverage is failing.

"The call log tells you what messaging is failing to answer."

A repeated call topic that never appears in messages signals that the messaging flow is not surfacing that particular concern early enough — the guest is reaching for the phone because message-based interaction left them unsure.
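The missed-call reading described in this section can be sketched as a count by hour of day. The sketch assumes a call log has been exported or typed up as (timestamp, answered) pairs; that shape and the sample entries are assumptions, since phone providers differ in what they export.

```python
from collections import Counter
from datetime import datetime

def missed_calls_by_hour(call_log):
    """Count unanswered calls per hour of day (0-23)."""
    counts = Counter()
    for started_at, answered in call_log:
        if not answered:
            counts[started_at.hour] += 1
    return counts

# Illustrative data only.
sample_call_log = [
    (datetime(2024, 6, 3, 13, 5), False),
    (datetime(2024, 6, 3, 13, 40), False),
    (datetime(2024, 6, 4, 9, 15), True),
    (datetime(2024, 6, 5, 13, 20), False),
    (datetime(2024, 6, 5, 21, 2), False),
]

missed = missed_calls_by_hour(sample_call_log)
for hour, n in sorted(missed.items()):
    print(f"{hour:02d}:00  {'#' * n}")
```

A cluster in one hour, as with the early-afternoon calls in the sample, points to a specific gap in response coverage rather than a general staffing problem.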

Source 5

Email & Direct Inquiries

Email is the residual channel for most independent hospitality operators. Direct-booking inquiries arrive there. Returning guests often reach out via email rather than through OTA platforms. Pre-arrival logistics for guests who don't use WhatsApp typically run via email.

The volume is lower than platform messaging or WhatsApp but the per-inquiry value is often higher: direct-booking inquiries through email mean the operator avoids OTA commission on the resulting reservation, and returning-guest emails often carry the highest conversion-to-booking rate of any channel.

What email tells you

Email tells you which guests are seeking the operator out directly versus which are arriving via the OTA discovery flow. The pattern of which guests email — repeat guests, referral guests, longer-stay guests, higher-budget guests — is often instructive. An operator whose email volume from repeat guests is rising over time is seeing the early signs of a building direct-booking base.

Source 6

Calendars & Bookings

Booking calendars across platforms record what actually converted: dates, properties, guest counts, lengths of stay, prices, cancellation history. This is the data most operators are most familiar with — it is what their PMS displays, what their dashboards summarise, what their accountants reconcile against revenue.

What it tells you, beyond the basic occupancy and revenue numbers, is the shape of demand across time. Which months book earlier and which book later. Which properties hold longer-stay bookings versus shorter ones. Which days of the week produce the booking events versus which produce the inquiries. How cancellations distribute across lead times. Which channels produce the highest cancellation rates.

Most useful pairing: calendars + messaging

Matching inquiry timestamps against booking timestamps across the same period reveals the inquiry-to-booking conversion shape that operators almost never see directly. Conversion rate by channel, by language, by time of day, by season — all derivable from these two sources together.
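A minimal sketch of the conversion calculation, assuming the hard part is already done: each inquiry has been matched by hand against the calendar and marked as booked or not. The field names and sample records are hypothetical.

```python
from collections import defaultdict

def conversion_rate_by(inquiries, field):
    """Share of inquiries that became bookings, grouped by one field."""
    totals = defaultdict(int)
    booked = defaultdict(int)
    for inquiry in inquiries:
        group = inquiry[field]
        totals[group] += 1
        if inquiry["booked"]:
            booked[group] += 1
    return {group: booked[group] / totals[group] for group in totals}

# Illustrative data only; "booked" comes from matching against the calendar.
sample_inquiries = [
    {"channel": "airbnb", "language": "en", "booked": True},
    {"channel": "airbnb", "language": "de", "booked": False},
    {"channel": "airbnb", "language": "en", "booked": False},
    {"channel": "airbnb", "language": "de", "booked": False},
    {"channel": "whatsapp", "language": "es", "booked": True},
    {"channel": "whatsapp", "language": "en", "booked": True},
]

by_channel = conversion_rate_by(sample_inquiries, "channel")
by_language = conversion_rate_by(sample_inquiries, "language")

for group, rate in sorted(by_channel.items()):
    print(f"{group}: {rate:.0%}")
for group, rate in sorted(by_language.items()):
    print(f"{group}: {rate:.0%}")
```

Grouping by a field name keeps the same function usable for time of day or season once those columns exist. With only a handful of inquiries per group the rates are anecdotes, which is one reason the ninety-day minimum matters.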

Source 7

Payments & Pricing

Payment records and pricing history together form the financial ground truth of the operation. They tell the operator what was charged, what was actually collected, when revenue arrived, where price adjustments happened, and what the pricing patterns look like across seasons.

What is visible in payments and pricing data is, broadly, the operator's pricing discipline against demand. Did prices move in response to demand signals, or did they stay flat through predictable peaks? Where are the moments when the property held a price that the demand signal said should have been higher? Where are the moments where the price was too high and bookings collapsed?

The practical challenge is that pricing data is rarely consolidated. Most platforms show current prices and recent changes, but the full pricing history over the past two years often requires deliberate logging. Operators who maintain a simple pricing log alongside their PMS gain a meaningful analytical advantage over those who don't.
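One way a simple pricing log might look, sketched as a CSV file that gains one row per price change. The column names, file name, and sample entries are assumptions for illustration; a spreadsheet with the same columns would serve equally well.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical columns for the log.
LOG_FIELDS = ["changed_on", "property", "stay_date", "old_price", "new_price", "reason"]

def log_price_change(log_path, changed_on, property_name, stay_date,
                     old_price, new_price, reason):
    """Append one price change to the CSV log, writing a header if the file is new."""
    log_path = Path(log_path)
    is_new_file = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new_file:
            writer.writeheader()
        writer.writerow({
            "changed_on": changed_on.isoformat(),
            "property": property_name,
            "stay_date": stay_date.isoformat(),
            "old_price": old_price,
            "new_price": new_price,
            "reason": reason,
        })

# Demonstration in a temporary folder; a real log would live in a fixed location.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "pricing_log.csv"
    log_price_change(log_file, date(2024, 6, 1), "Sea View Flat",
                     date(2024, 8, 10), 120, 140, "local festival weekend")
    log_price_change(log_file, date(2024, 6, 20), "Sea View Flat",
                     date(2024, 8, 10), 140, 130, "no bookings after two weeks")
    with log_file.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

for row in rows:
    print(row["changed_on"], row["stay_date"],
          row["old_price"], "->", row["new_price"], row["reason"])
```

The "reason" column is the part platforms never record, and it is what makes the log readable against the calendar two years later.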

Source 8

PMS, Tools & Spreadsheets

Most independent operators eventually accumulate some combination of property management system data, channel-manager records, ad-hoc tool exports, and personal spreadsheets that capture decisions and notes the platforms don't. This source is variable in quality but often contains the operator's most considered judgement about the operation — the spreadsheet where the operator tracks why one property is underperforming, the ad-hoc note where they flagged a recurring guest issue.

What this data tells you, when read carefully, is the operator's own evolving theory of their business. Which patterns they already noticed. Which interventions they tried. Which questions they had been intending to look into and never got around to.

The practical challenge is that most spreadsheets and notes were never designed to be read systematically. They were working documents. Operators who consolidate their spreadsheet thinking into a written summary, say once a quarter, find the historical record substantially more useful than those who let the spreadsheets accumulate without periodic synthesis.

The key move

How the Sources Connect

The single most underused move in independent hospitality data is pairing. Each source above is useful read alone. Each is materially more useful read against another.

The most productive pairings

Messages + Reviews — where pre-booking concerns and post-stay complaints overlap
Messages + Calendar — revealing the conversion shape across times of day, languages, properties
Reviews + Channel data — where the same property is reviewed differently by guests from different platforms
Calls + Messages — which questions messaging is failing to resolve early enough
Pricing history + Calendar — where pricing decisions and demand outcomes align or diverge

The point of pairing is not to assemble a comprehensive integrated dataset. It is to bring two sources together and let the comparison generate insight. An operator who reads messages alongside reviews for one quarter will surface more actionable findings than an operator who reads either source alone for a year.

Reading this inventory in full produces a recognisable reaction: "I have all of this. I am using almost none of it." That reaction is the right one.

The realistic move is to start with the source carrying the highest yield-per-hour — platform messages plus WhatsApp for most operators. Read ninety days with one specific question in mind. Identify what the data is telling you. Resolve the highest-frequency issue at the source. Then pair messages with reviews and let the overlap surface the next set of moves.
