A New Kind of Autotrack

Vikram Aditya

CEO & Co-founder

August 1, 2023

MIXPANEL

It’s safe to assume that if you’re reading this, you’re at least moderately acquainted with Mixpanel and why it’s so popular.

So, I won’t be elaborating too much on this part.

Nevertheless, here’s a brief recap:

Mixpanel’s approach targets three pillars: Events, users, and properties.

Events: Any interaction a user has with your product is an event.

Users: The people who interact with your product.

Properties: Traits or characteristics of particular events and users.
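As a rough illustration of how the three pillars relate to each other, here is a minimal Python sketch. The class and field names (`User`, `Event`, `distinct_id`, `properties`) are assumptions made for this example, not Mixpanel's actual schema or SDK.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class User:
    distinct_id: str
    # Traits of the user, e.g. plan or country (hypothetical keys).
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Event:
    name: str
    user: User
    # Traits of this particular interaction (hypothetical keys).
    properties: dict[str, Any] = field(default_factory=dict)


buyer = User("user-1", {"plan": "free", "country": "IN"})
purchase = Event("Order Placed", buyer, {"item": "headphones", "price": 49.0})
```

Every event points at the user who performed it, and both carry their own properties, which is what makes the segmentation and funnel reports described next possible.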

The product unlocks five primary use cases:

A: EASILY DIGESTIBLE ANALYTICS:

With its dashboards and reports, Mixpanel makes it easy to peruse the data that is crucial for understanding user behavior, patterns, and trends.

B: A/B TESTING:

By providing an easy-to-use hub for ascertaining which variables affect user behavior, Mixpanel makes it far easier to make educated product decisions.

C: RETENTION ANALYSIS:

The product further aids decision-making by providing insights into user retention and into which features and updates can improve it.

D: USER SEGMENTATION:

By allowing users to be segmented by their behaviours or characteristics, Mixpanel enables the analysis of how different functionalities impact different cohorts of users.

E: FUNNEL ANALYSIS:

With this feature, teams can view events in sequence and follow customer behaviour through a granular look at the user flow.
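To make the idea of a funnel concrete, here is a small Python sketch of how step-by-step counts could be computed from a time-ordered event stream. The function name `funnel_counts` and the sample data are hypothetical; this is not how Mixpanel computes funnels internally.

```python
from collections import defaultdict


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order.

    `events` is an iterable of (user_id, event_name) tuples sorted by time.
    """
    progress = defaultdict(int)  # user_id -> number of funnel steps completed
    for user_id, name in events:
        next_step = progress[user_id]
        if next_step < len(steps) and name == steps[next_step]:
            progress[user_id] += 1
    return [sum(1 for done in progress.values() if done > i) for i in range(len(steps))]


events = [
    ("u1", "View Product"), ("u1", "Add to Cart"), ("u1", "Place Order"),
    ("u2", "View Product"), ("u2", "Add to Cart"),
    ("u3", "View Product"),
]
steps = ["View Product", "Add to Cart", "Place Order"]
counts = funnel_counts(events, steps)  # users remaining at each step
```

With this sample data, three users view a product, two add it to the cart, and one places an order, so the drop-off between steps is visible at a glance.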

All this functionality sounds like a lot to effectively utilize though, right?

It is.

Retrieving and analyzing the data and insights that come with product management is complex enough; learning how to optimally leverage a product that gives you this data is another challenge entirely.

A cursory look at any website with user feedback for SaaS tools would tell you the same: everyone praises Mixpanel’s suite of functionality, but its ease of use leaves a fair amount to be desired, which keeps users from getting the most out of the tool. Moreover, it keeps a product team’s potential from being fully realized.

There are three salient ways in which Mixpanel falls victim to this:

Mixpanel’s setup is notoriously difficult; not only does one need to be able to code, but the setup also takes at least several days of work. This is primarily because, powerful as it is, Mixpanel can be quite a difficult tool to get the hang of.

PAINSTAKING WORKFLOW

First, the workflow consumes excessive time and resources:

‘Finding common metrics like revenue and repeat purchase for regular user groups is easy, but obtaining nuanced data for any experiments or research is time-consuming. This is because there are around 250-300 tables and numerous parameters that analysts must sift through to identify the appropriate metrics before writing the SQL query.’

–Shaurya Sindhu, PM, Flipkart

The precision required in setup means that small mistakes can lead to costly inaccuracies in data.
Integrating with other tools in a company’s stack can be cumbersome.
Product fixes and updates can be delayed.

In large part, this is because Mixpanel does not have autotrack capabilities. Autotrack would remove the need for users to manually configure Mixpanel to suit their particular needs, which would in turn remove the need for the technical knowledge that is currently required.

Of course, this is not because Mixpanel didn’t consider pursuing it: they introduced autotrack in 2016, but soon abandoned the project. Indeed, there are real reasons why it poses enough problems to make it not worth pursuing. As Mixpanel’s own VP of Product and Design, Neil Rahilly, says:

“It is a fundamentally flawed approach that falls far short of its promise. If you use it, you’ll end up with limited and unreliable data, spend even more developer time and money trying to fix problems and inconsistencies, and expose yourself and your customers to major security and privacy risks.”

Before looking at why autotrack may now be tenable (owing to advances in transformer models), let’s look at the arguments for why it hasn’t been viable until now.

1. NEED FOR DEVELOPER BANDWIDTH

Borrowing from Rahilly again, let’s look at an example that illustrates how each of the potential pitfalls discussed above could arise:

Let’s say you have a checkout page with three buttons labelled ‘Place Order’, ‘Continue Shopping’, and ‘Empty Cart’. Because autotrack (or codeless analytics) tracks everything, every press of any of these buttons is recorded. So, the data you get about the checkout page is raw and unfiltered: you have to rely on the (ostensibly) graphical interface in your analytics tool to specify that the ‘checkout attempted’ event is defined as any click on ‘Place Order’.

The problem arises when you realize that there could be multiple ways for a user to check out: they could press Enter, or there could be a one-click checkout that bypasses this page entirely. The point is that, to maintain data accuracy, edge cases like these would require you to manually input your parameters anyway, which requires a developer.
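The gap can be sketched in a few lines of Python. The raw event shape, the `intent` field (ground truth included purely for illustration), and the function `is_checkout_attempt` are all assumptions for this example rather than the output format of any real autotrack tool.

```python
# Raw interactions as an autotrack tool might capture them (illustrative data).
# "intent" is ground truth added only for this example; a real tool would not know it.
raw_events = [
    {"user": "u1", "type": "click", "label": "Place Order", "intent": "checkout"},
    {"user": "u2", "type": "keypress", "key": "Enter", "intent": "checkout"},
    {"user": "u3", "type": "click", "label": "Buy Now", "intent": "checkout"},
    {"user": "u4", "type": "click", "label": "Continue Shopping", "intent": "other"},
]


def is_checkout_attempt(event):
    """Retroactive definition: any click on 'Place Order'."""
    return event["type"] == "click" and event.get("label") == "Place Order"


matched = [e["user"] for e in raw_events if is_checkout_attempt(e)]
missed = [
    e["user"]
    for e in raw_events
    if e["intent"] == "checkout" and not is_checkout_attempt(e)
]
```

Here the definition catches only `u1`. The Enter keypress (`u2`) and the one-click purchase (`u3`) are real checkout attempts that go uncounted until someone extends the definition by hand.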

Similarly, let’s say that you want to see how many users have bought a particular product. Because your event collection is one-size-fits-all, data for particular items isn’t collected at all, so you need a developer.

Furthermore, if you change anything in your app (such as the wording on a button), a tool using autotrack would likely falter, since it wouldn’t be able to ascertain that it’s the same button.

2. SECURITY

Finally, security is a significant hurdle for any no-code, autotrack solution, since a tool that records everything by default can also record sensitive data. Essentially, the only way to get around this is to manually whitelist the data that is tracked.
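A manual whitelist of this kind might look like the following Python sketch. The field names and the `sanitize` helper are hypothetical, chosen only to show the shape of the approach.

```python
# Hypothetical allowlist: only these captured fields may be sent to analytics.
ALLOWED_FIELDS = {"element_type", "label", "page"}


def sanitize(captured):
    """Keep only explicitly allowlisted fields; everything else is dropped."""
    return {key: value for key, value in captured.items() if key in ALLOWED_FIELDS}


captured = {
    "element_type": "input",
    "label": "Card number",
    "page": "/checkout",
    "value": "4111 1111 1111 1111",  # sensitive: should never be tracked
}
safe = sanitize(captured)
```

The trade-off is visible even in this toy version: the list has to be written and maintained by hand, which is exactly the manual work autotrack was supposed to remove.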

Greco from Amplitude makes similar arguments, adding founder Jeffrey Wang’s point that autotrack doesn’t save time for a company as a whole; it ‘shifts it to a less scalable process’. In a nutshell, while those who would have been responsible for deciding what to track no longer have to, others (such as Product Managers) would have to manually sort through new components and rename relevant events.

I know how clichéd it is to say ‘AI solves this’, especially today. But bear with me: even a cursory thought makes it apparent that in this case, given the nature of the problems, AI genuinely does solve this.

Now, let’s look at each of the issues raised by Amplitude and Mixpanel themselves:

3. INACCURATE DATA

The argument Rahilly makes here is that autotrack is ultimately a net negative, as the only way to make it tenable is to use developers’ bandwidth, which defeats its purpose.

However, the overarching problem here is that existing autotrack tools are blunt instruments that cannot discern which data is relevant. Because modern LLMs are good at picking out relevant information from large volumes of text, it is easy to imagine a fine-tuned model doing the same once trained on a data set of relevant apps.

Therefore, there is no reason for an autotrack tool built today to have these flaws.

4. TOO MUCH DATA

As with the problem above, autotracking collects too much data because there is no ‘sorting’ mechanism, no means of sifting out the relevant data. An LLM that has been fine-tuned and enhanced with manual labeling would fix this by giving an autotracking tool the ability to be selective in the data it tracks, based on textual prompts.
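One way to picture this is a filter whose relevance check is pluggable. In the Python sketch below, `select_events` and the `Judge` type are hypothetical, and `keyword_judge` is a deliberately crude keyword stand-in for where a fine-tuned model would sit; it is not a real model call and not a description of any existing product.

```python
from typing import Callable, Iterable

# (instruction, event) -> should this event be kept?
Judge = Callable[[str, dict], bool]


def select_events(events: Iterable[dict], instruction: str, judge: Judge) -> list[dict]:
    """Keep only the events the judge deems relevant to the instruction."""
    return [event for event in events if judge(instruction, event)]


def keyword_judge(instruction: str, event: dict) -> bool:
    """Stand-in for a model: keep events whose label shares a word with the instruction."""
    wanted = set(instruction.lower().split())
    return bool(wanted & set(event["label"].lower().split()))


events = [
    {"label": "Place Order"},
    {"label": "Toggle dark mode"},
    {"label": "Empty Cart"},
]
kept = select_events(events, "track order and cart actions", keyword_judge)
```

Swapping `keyword_judge` for a model-backed function would leave the rest of the pipeline unchanged, which is the point of keeping the relevance decision behind a single interface.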

5. DOESN’T SAVE TIME

Currently, autotracking requires people to manually update the specifications each time there is even a relatively trivial update to the product. Just changing the name of a button can cause tracking to malfunction. Because this is a hassle in and of itself, it nullifies the time saved during setup. An LLM would be able to recognize such changes for what they are, thereby preserving the time saved.
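The difference between brittle and tolerant matching can be shown with a short Python sketch. It uses string similarity from the standard library's `difflib` as a simple stand-in for a model's judgment; the event names, the `resolve_event` helper, and the 0.6 threshold are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical mapping from button labels to tracked event names.
KNOWN_EVENTS = {
    "Place Order": "checkout_attempted",
    "Continue Shopping": "shopping_continued",
    "Empty Cart": "cart_emptied",
}


def resolve_event(label, threshold=0.6):
    """Map a possibly renamed button label to a known event, or None if nothing is close."""
    def score(known):
        return SequenceMatcher(None, known.lower(), label.lower()).ratio()

    best = max(KNOWN_EVENTS, key=score)
    return KNOWN_EVENTS[best] if score(best) >= threshold else None


exact = KNOWN_EVENTS.get("Place your order")  # exact lookup breaks after the rename
fuzzy = resolve_event("Place your order")     # similarity matching still finds the event
```

String similarity only handles small rewordings; a rename such as ‘Place Order’ to ‘Complete Purchase’ shares almost no characters and would need something that understands meaning, which is where a language model would come in.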

CONCLUSION

No one disputes that autotrack has potential benefits; the argument is just that, currently, the costs are too high.

The fundamental problem underlying all these issues is that autotracking necessitates a one-size-fits-all approach. This is because the current tools used to automate the process cannot distinguish between data according to a particular set of preferences the way people do. By embodying people's preferences in a model, this issue can be solved from the ground up.

Simply put, AI allows people to digitally embody themselves, which means their preferences and decision-making processes can be embodied as well.

This enables the benefits that autotracking can provide without the costs that, until now, have been impossible to avoid.

Utilizing this value prop of AI is exactly what we've centred Crunch around.

The ease of autotrack without the need for manual tagging.

Sign up for the waitlist here!
