A mid-size IT integrator asked Partners to build a time-tracking system its employees and contractors would actually use. The brief was usability. The result was a voice-first system with no form. People speak short natural-language descriptions of their day into a phone, and an on-prem AI translation layer turns those notes into structured project, customer, and category data. Three sprints to deliver. By the end of month one, 98 % of the team used it daily. The data surfaced 15 % of total effort that had been going unrecorded: billable work tied to change requests, ad-hoc client conversations, and unplanned scope. In money terms: €1.95M of previously invisible billable effort, now in scope through contract renegotiation and corrected invoicing on running engagements.
The brief vs. the need
The integrator's leadership had bought time-tracking systems before. Two, to be specific. Both had been switched on, used for two weeks, and abandoned by the people whose time they were supposed to capture. The pattern is familiar in mid-market services firms: every time-tracking tool ships with a form, and the form is the problem.
When a developer ends a Tuesday with seven hours of work spread across four projects and a half-day of unscheduled change requests, the form asks them to remember it all, structure it, classify it, and type it in. The form is asking the developer to do the operations team's job. So the developer logs the minimum (eight hours, four projects, one tag each) and goes home. The data the company actually runs on never reaches the database.
The consequences were clear. Project profitability was unmeasurable in real time. Resource planning was guesswork. Change requests went untracked. The work was being done, but the form could not capture it without making the team's day worse, so a meaningful share of billable effort never reached the database.
What was broken
The pre-state was familiar to anyone who has run a services business.
The old system tracked compliance at the coarsest possible level: worked time vs. holiday. That was it. In practice it held no project allocation, no category, no customer attribution. Detail was supported in theory, but nobody entered it. The friction was too high.
Operational reporting flowed through Excel. Each cycle, someone on the operations team pulled exports from the tracker, reconciled them against project rosters, and produced a report. It took real time. It depended on whoever was free to do it. By the time anyone could see what had been worked on, the moment to act on it had passed.
Underneath that sat the structural gap. Change requests, ad-hoc client conversations, and unplanned scope had no place in the form. They got absorbed into other line items, or vanished entirely. A meaningful share of that effort was billable. None of it reached the invoicing layer, so none of it was ever invoiced.
What we built
The system replaced the form with a voice note.
Throughout the day, employees and contractors could speak natural-language descriptions of what they had just done into their phone: "I spent half my day on the new feature for customer X." The voice notes were transcribed, interpreted, and converted into structured time entries by an AI translation layer that already knew the company's project roster, employee assignments, and customer relationships.
Each note resolved into:
- A duration estimate interpreted from the language ("half a day", "this morning", "an hour or so")
- A project identification matched against the company's active engagements
- A category (development, change request, meeting, internal, support, bug fix) derived from the description and the project context
- An effort allocation linked to the right customer-project context, ready for the invoicing pipeline
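The shape of that resolution can be illustrated with a deliberately simple, rule-based sketch. It is not the AI translation layer described here. The phrase table, the eight-hour working day, the `TimeEntry` fields, the roster, and the keyword rules are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

WORKDAY_HOURS = 8.0  # assumption: one working day is eight hours

# Hypothetical phrase table; a real system would interpret far freer language.
DURATION_PHRASES = {
    "half my day": 0.5 * WORKDAY_HOURS,
    "half a day": 0.5 * WORKDAY_HOURS,
    "this morning": 0.5 * WORKDAY_HOURS,
    "an hour or so": 1.0,
}

# Hypothetical roster: project name -> customer.
ROSTER = {
    "billing portal": "Customer X",
    "intranet migration": "Customer Y",
}

# Hypothetical keyword rules, checked in order; the first match wins.
CATEGORY_KEYWORDS = [
    ("change request", "change request"),
    ("bug", "bug fix"),
    ("meeting", "meeting"),
    ("feature", "development"),
]


@dataclass
class TimeEntry:
    hours: Optional[float]     # None when no duration phrase was recognized
    project: Optional[str]     # None when no roster project was mentioned
    customer: Optional[str]
    category: Optional[str]
    note: str                  # original wording, kept for later review


def interpret(note: str) -> TimeEntry:
    """Resolve one transcribed note into a structured entry (toy rules only)."""
    text = note.lower()
    hours = next((h for p, h in DURATION_PHRASES.items() if p in text), None)
    project = next((p for p in ROSTER if p in text), None)
    customer = ROSTER.get(project) if project else None
    category = next((c for k, c in CATEGORY_KEYWORDS if k in text), None)
    return TimeEntry(hours, project, customer, category, note)


entry = interpret("I spent half my day on the new feature for the billing portal.")
print(entry)
```

Fields the toy rules cannot resolve stay `None` instead of being guessed, so an incomplete entry is visible as incomplete.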
At month-end, each user received a generated overview: their recorded time, the underlying notes, the categorized activities, and a structured report ready for review. They could edit or refine any entry, but most didn't need to. The AI's first interpretation was good enough that the data landed usefully without manual cleanup.
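A month-end overview of this kind is, at its core, a grouping of entries by category with the original notes kept alongside. The sketch below shows one way to do that. The entry tuples and the `monthly_overview` helper are hypothetical and the figures are made up.

```python
from collections import defaultdict

# Hypothetical entries for one user in one month: (category, hours, note).
entries = [
    ("development", 4.0, "Half a day on the new feature."),
    ("change request", 1.0, "An hour or so on a change request."),
    ("development", 2.5, "Finished the export screen."),
]


def monthly_overview(entries):
    """Group one user's entries by category, keeping the underlying notes."""
    overview = defaultdict(lambda: {"hours": 0.0, "notes": []})
    for category, hours, note in entries:
        overview[category]["hours"] += hours
        overview[category]["notes"].append(note)
    return dict(overview)


for category, data in monthly_overview(entries).items():
    print(f"{category}: {data['hours']:.1f} h across {len(data['notes'])} notes")
```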
Four principles drove the build:
- Minimal manual interaction. No structured forms. No required taxonomy. The system interpreted natural language and only asked the user when the input was genuinely ambiguous.
- Only the functionality each user needs. A developer's view differed from a delivery lead's, which differed from operations'. Most enterprise tools expose every feature to everyone; this one exposed only what the person opening it actually used.
- Local-only deployment. All AI processing (transcription, language understanding, categorization) runs on the client's own infrastructure. No client data leaves the building. For European mid-market services firms with data-sovereignty obligations and IP-sensitive client work, this is the only architecture that's actually deployable.
- Editability over correctness. The system never assumed it was right. It produced its best interpretation, showed the user what it had decided, and made overriding any entry easy. Users trusted the system because every decision was visible and reversible.
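Two of these principles, asking only when the input is genuinely ambiguous and letting the user override anything, can be sketched as a small decision rule. This is an illustration under stated assumptions, not the system's actual logic. The candidate scores, both thresholds, and the `Interpretation` type are hypothetical.

```python
from dataclasses import dataclass

# Assumed thresholds; real values would need tuning on real usage.
MIN_SCORE = 0.6    # below this, the best guess is too weak to accept silently
MIN_MARGIN = 0.15  # below this, the top two candidates are too close to call


@dataclass
class Interpretation:
    project: str
    needs_confirmation: bool
    edited_by_user: bool = False


def resolve(candidates: dict[str, float]) -> Interpretation:
    """Pick the best-scoring project; flag it when the call is ambiguous.

    Assumes `candidates` is non-empty and maps project name -> match score.
    """
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    best_project, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    ambiguous = best_score < MIN_SCORE or (best_score - runner_up) < MIN_MARGIN
    return Interpretation(best_project, needs_confirmation=ambiguous)


def override(interp: Interpretation, project: str) -> Interpretation:
    """A user edit always wins and clears any pending question."""
    return Interpretation(project, needs_confirmation=False, edited_by_user=True)


clear = resolve({"billing portal": 0.91, "intranet migration": 0.22})
unclear = resolve({"billing portal": 0.55, "intranet migration": 0.50})
fixed = override(unclear, "intranet migration")
print(clear, unclear, fixed, sep="\n")
```

The rule never blocks an entry. An ambiguous note still gets a best guess, and the flag only decides whether the user is asked about it.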
The build moved through three iterative sprints.
Sprint 1. A working time-tracking system end-to-end. Voice capture, AI interpretation, structured database, basic reporting. Deployable and usable at the end of the sprint.
Sprints 2 and 3. Iteration based on real usage. Edge cases in language parsing. Project-recognition tuning when descriptions were vague. Per-role views. Reporting refinements. By the end of sprint 3 the system was production-grade across the company.
The result that mattered
The adoption number was the test the brief had been written around. Most time-tracking deployments hit their failure point in week three. Adoption drifts down, the operations team starts chasing, and the tool eventually settles into a low-compliance state that everyone has agreed not to talk about. This one did not.
By the end of month one, 98 % of employees and contractors were using the system daily. That number had never been reached on either of the two previous tools the company had deployed. The system was passing the only test that matters in enterprise software: people were using it, voluntarily, because using it was easier than not using it.
Beneath the adoption number sat the financial outcome the brief had pointed at. The data surfaced 15 % of total effort that had been going unrecorded: work tied to change requests, ad-hoc client conversations, unplanned scope, and the small but constant favors that keep a services relationship alive. None of it had been visible before. A meaningful share of it should have been billed. In money terms, that share was worth €1.95M of invisible revenue across running engagements. None of it had been collectible before. It is now in scope.
The operational side effect was just as material. Excel-based reporting was eliminated entirely. The structured data sits live in the database, refreshed continuously, ready for any view the operations team wants to construct. The week between "what happened" and "what we know" closed completely. The consolidation work that every reporting cycle used to require is gone. Not optimized, not reduced, gone.
What we didn't expect
Two findings landed once the data was real, neither of which had been in the brief.
The first was contractual. Across several running engagements, the captured-versus-planned numbers told a clear story: the company had underestimated delivery effort at proposal time. The contracts needed renegotiation. Not because the work had grown, but because the original scope had been priced against a delivery cost the company could not actually meet. That conversation with clients is uncomfortable. Having the data made it possible. Without it, the projects would have kept bleeding margin until the engagement ended.
The second was operational. With effort visible across every project, weekly cross-project replanning got faster. Delivery leads could see in minutes where capacity was tight, where a hand-off was overdue, where a project was quietly slipping. Decisions that used to need a Friday review now happened on Tuesday.
Most enterprise software fails at adoption because it is built around the form. The form is the problem.
AI's role in the next decade of enterprise systems is not to replace the form with a smarter form. It is to remove the form, listen to people speak naturally, and translate what they say into the structured data the business runs on. The translation layer is the product.
The best tool is the one that works with almost no interaction from the user.
A diagnostic you can run this week, without us
If you run a mid-market services firm, here is what you can do this week, without hiring anyone, to find out whether you carry the same exposure.
Step 1. Ask three of your most senior delivery people what they worked on yesterday. Don't ask them to fill in a form. Ask them in person, casually, and write down what they say. Then compare their answer to what is logged in your current time-tracking system. If the form-tracked data is thinner than the conversation, your problem is not your team's discipline. It is your form. Your people know what they did. The form fails to capture it.
Step 2. Pull your last three months of change-request volume. Then pull the time logged against change-request categories. Compute the ratio. If less than one billable hour is logged for every change request issued, you are leaking billable effort at the structural level. Most of it lives in the gap between "we'll handle that quickly" and the line that should have appeared in next month's invoice.
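The ratio in Step 2 is a single division. A minimal sketch follows. The function name and the figures are made up for illustration, and the threshold of one hour per change request is the one stated in the step.

```python
def hours_per_change_request(cr_hours_logged: float, change_requests_issued: int) -> float:
    """Hours logged against change-request categories per change request issued."""
    if change_requests_issued <= 0:
        raise ValueError("need at least one change request to compute a ratio")
    return cr_hours_logged / change_requests_issued


# Made-up three-month figures: 27 hours logged against 60 change requests.
ratio = hours_per_change_request(cr_hours_logged=27.0, change_requests_issued=60)
leaking = ratio < 1.0  # threshold from Step 2: under one hour per change request

print(f"{ratio:.2f} h logged per change request; below threshold: {leaking}")
# prints: 0.45 h logged per change request; below threshold: True
```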
Step 3. Run a one-week voice-note experiment. Ask the same three senior delivery people to dictate a 60-second voice note at the end of each day for one week, in their own words. Have someone categorize the notes against your project list. Compare to what your tracking tool captured. The gap is the size of the problem. It's a rough lower bound on the billable effort that never reaches your database.
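Once the notes are categorized, the comparison in Step 3 is a per-project subtraction. The sketch below assumes both sources have already been reduced to hours per project. The project names and figures are made up, and `capture_gap` is a hypothetical helper.

```python
# Made-up one-week totals, in hours per project.
from_voice_notes = {
    "billing portal": 31.0,
    "intranet migration": 12.5,
    "unplanned support": 6.0,
}
from_tracker = {
    "billing portal": 30.0,
    "intranet migration": 10.0,
}


def capture_gap(noted: dict, tracked: dict) -> dict:
    """Hours present in the categorized notes but missing from the tracker."""
    gap = {}
    for project, hours in noted.items():
        missing = hours - tracked.get(project, 0.0)
        if missing > 0:
            gap[project] = missing
    return gap


gap = capture_gap(from_voice_notes, from_tracker)
total_gap = sum(gap.values())
share = total_gap / sum(from_voice_notes.values())

print(gap)
print(f"{total_gap:.1f} h uncaptured, {share:.0%} of the effort described in the notes")
```

Turning those hours into money requires your own billable rates and a judgment about which hours were billable. Neither is assumed here.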
The form-tracked data tells you what your team had time to type. The conversation tells you what your team actually did. The two should match. They almost never do.