Ideal Session Definition
peterjakus [Yesterday at 1:44 PM]
*What would be your ideal session definition (for B2C ecommerce)?*
A session does not exist. It's an arbitrary construct, right? You might have figured that out yourself, or perhaps you've read @simoahava’s excellent post https://www.simoahava.com/analytics/the-schema-conspiracy/. Now imagine that you could define a session any way you'd like :star-struck: :exploding_head: . What would you do (let's say for B2C fashion ecommerce, or for an online jewellery store)? I have this opportunity and I'd like to make the most of it. I'll throw in some thoughts here, and I hope the brilliant minds here will help me start an inspiring discussion.
First, what I don't like about GA's definition (https://support.google.com/analytics/answer/2731565?hl=en) :
- I'm OK with a session ending after X amount of inactivity, but I don't like :-1: that the session-end timestamp is the timestamp of the last event, so by default the time spent on the last page is not calculated and not counted in session duration. :muscle: Better option: send a ping every few seconds (while the page is in the foreground) and close the session after X amount of time with no ping
- :-1: Session ends at midnight. Seriously? What's the business value/reasoning for this?
- :man-shrugging: A new session is created when the user comes from another campaign source. Here I do get that it's helpful for attribution, however I don't like :-1: that it's inconsistent:
  - _2 clicks from *the same* Google Ads autotagged campaign = *2 sessions*_
  - _2 clicks from *the same* UTM tagged Google Ads campaign = *1 session*_
  - _2 clicks from *two different* FB campaigns (with the same source) = *1 session*_
  - and it's not only this ad network inconsistency: you also can't rely on the fact that a person who had 3 sessions interacted with the site over a period longer than one hour (maybe he made three 5-second visits within one minute) (edited)
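A minimal sketch of the ping-based idea from the first bullet, assuming each ping is just a timestamp in seconds and that a 30-minute gap closes the session. The `Session` class, the `sessionize` function and the 10-second ping interval are illustrative names and values, not anything GA provides. Because the session end is the last ping, time on the last page is counted, and there is no midnight rule at all.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    start: float  # timestamp (seconds) of the first ping
    end: float    # timestamp (seconds) of the last ping

    @property
    def duration(self) -> float:
        return self.end - self.start


def sessionize(ping_times: List[float], timeout: float = 30 * 60) -> List[Session]:
    """Group ping timestamps into sessions split by gaps longer than `timeout`."""
    sessions: List[Session] = []
    for t in sorted(ping_times):
        if sessions and t - sessions[-1].end <= timeout:
            sessions[-1].end = t  # still the same session, extend it
        else:
            sessions.append(Session(start=t, end=t))  # gap too long, open a new one
    return sessions


# Hypothetical visitor: pings every 10 s for 2 minutes, then nothing for
# 40 minutes, then 30 more seconds of pings.
pings = [float(t) for t in range(0, 121, 10)] + [2520.0 + t for t in range(0, 31, 10)]
sessions = sessionize(pings)
for s in sessions:
    print(s.start, s.end, s.duration)
```

With these sample pings the 40-minute gap exceeds the timeout, so the visitor ends up with two sessions whose durations reflect the time the page was actually in the foreground.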
71 replies
peterjakus [18 hours ago] So I'd say it's a no-brainer to have a consistent approach to all ad networks, however I'm not sure if that should be ALWAYS break the session (when the user comes from outside) or NEVER break the session.
Johan Terpstra [18 hours ago] You’ve addressed most of the quirky points already. Throwing in a curveball… Why should it even be a fixed definition? Isn’t a session ultimately an attribute of the visitor? Some people poop in 2 minutes, some sit an hour on the loo. Shouldn’t those tracking systems therefore learn to adjust the definition to the user’s behaviour? If someone comes back every 35 minutes historically, that session definition shouldn’t be 30 minutes perhaps. Similar with the time after last event/view until session ends, that should perhaps also be visitor dependent. But that’s quite philosophical perhaps. The midnight cutoff is total BS. The campaign initiated sessions IMHO should be 1 session if within the session timeframe, whether fixed or user-adjusted. (edited)
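One possible way to sketch a visitor-dependent timeout, purely as an illustration: take the visitor's historical gaps between events, ignore the very long ones (assumed to be separate visits), and use the 90th percentile as that visitor's timeout. The percentile, the floor and ceiling, and the `adaptive_timeout` name are all assumptions made for this sketch, not an established method.

```python
from statistics import quantiles
from typing import List

DEFAULT_TIMEOUT = 30 * 60  # seconds; fallback when there is too little history


def adaptive_timeout(gaps: List[float], min_history: int = 5,
                     floor: float = 5 * 60, ceiling: float = 4 * 3600) -> float:
    """Derive a per-visitor inactivity timeout from past gaps between events."""
    # Gaps above the ceiling are treated as "clearly a different visit" and ignored.
    usable = [g for g in gaps if g <= ceiling]
    if len(usable) < min_history:
        return DEFAULT_TIMEOUT
    p90 = quantiles(usable, n=10)[-1]  # 90th percentile of the usable gaps
    return min(max(p90, floor), ceiling)


# Hypothetical visitor who keeps coming back roughly every 35 minutes.
gaps = [30, 60, 45, 2100, 2100, 2100, 2100, 2100, 2100, 2160, 3 * 86400]
print(adaptive_timeout(gaps) / 60)  # timeout in minutes
```

For this visitor the timeout lands just above 35 minutes, so the habitual return no longer splits the session, while a visitor with almost no history keeps the fixed 30-minute default.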
Zabe Agha [18 hours ago] I run a startup and we had to build out a concept of a session from scratch--unfortunately, what ended up happening is that we ended up piggy-backing on the definition being used by everyone else for the sake of consistency.
So, everyone has a definition--like it or not--so we fell into the same camp. See this example from Shopify:
peterjakus [18 hours ago] @Zabe Agha before you did surrender to the common definition, what did you try?
Zabe Agha [18 hours ago] @peterjakus Before, we thought the idea of a session would be either to the end of a transaction (a purchase) or a long period of inactivity--I think we were thinking 72 hours (which made sense for us in the context of our clients who are primarily B2C)
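The definition described here (a session runs until a purchase or until 72 hours of inactivity) could be sketched roughly as below. The event names and the `split_sessions` function are hypothetical and only illustrate the rule.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, event name)
HOUR = 3600.0


def split_sessions(events: List[Event], timeout: float = 72 * HOUR) -> List[List[Event]]:
    """Close a session on a purchase or after `timeout` seconds of inactivity."""
    sessions: List[List[Event]] = []
    current: List[Event] = []
    for ts, name in sorted(events):
        if current and ts - current[-1][0] > timeout:
            sessions.append(current)  # inactivity closed the previous session
            current = []
        current.append((ts, name))
        if name == "purchase":
            sessions.append(current)  # the transaction closes the session
            current = []
    if current:
        sessions.append(current)
    return sessions


# Hypothetical visitor: browses, adds to cart a day later, buys the day after,
# returns an hour after the purchase, then again more than 72 hours later.
events = [
    (0 * HOUR, "view"),
    (24 * HOUR, "add_to_cart"),
    (48 * HOUR, "purchase"),
    (49 * HOUR, "view"),
    (130 * HOUR, "view"),
]
for session in split_sessions(events):
    print([name for _, name in session])
```

Note that several campaign clicks can easily fall inside one such session, which is why a "sessions brought by campaign" report stops making sense under this definition.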
peterjakus [18 hours ago] @Johan Terpstra Personalised session definition? Interesting! I'm planning to do a blog post about personalised conversion rate (it's also just a philosophical idea atm, hopefully I'll come up with some practical proposals as well) but a visitor-dependent session definition never even crossed my mind. Did you think through the implications for analysis? I didn't, but my head spins :drunkhelbs: just hearing it.
peterjakus [18 hours ago] @Zabe Agha 72 hours? Wow. That could actually work. So you perhaps didn't plan to use "Sessions brought by Campaign" at all (because you'd quite often have several campaigns assisting in the same session), right?
Johan Terpstra [18 hours ago] Nope, I did not yet consider the ramifications. Some metrics would likely become unusable in terms of comparisons. The question here is how to pigeonhole something. In Holland we have this concept of “omdenken” which is something like thinking out of the box, thinking differently, so then the immediate reaction becomes “Why even pigeonhole it?“. Hence the idea of a fluid definition, user oriented. Chances are it completely breaks everything haha. (edited)
Johan Terpstra [18 hours ago] A post-last-activity grace/cooldown period makes a lot of sense to me.
peterjakus [18 hours ago] pls explain more @Johan Terpstra
lukasvermeer [18 hours ago] Stupid question: why define a "session" in the first place? I mean: what do we need it for? It's a broken construct, so can we not just simply do without it?
peterjakus [18 hours ago] @lukasvermeer what would you use (instead of session based conversion rate) for evaluating landing page performance?
lukasvermeer [18 hours ago] User based conversion rate (or bounce rate or whatever)?
lukasvermeer [18 hours ago] Or is the theory here that every time the user lands they have completely and utterly forgotten about the last time they used this particular website?
peterjakus [18 hours ago] that's a good point
lukasvermeer [18 hours ago] I mean, even if this is some super high frequency purchase website, and we want to somehow measure repeat purchase, wouldn't we want to look at purchases _per user_?
lukasvermeer [18 hours ago] This whole session concept has had me seriously puzzled for years. I simply cannot think of a reason for it to exist.
lukasvermeer [18 hours ago] Unless you're a mathematician and you want to simplify some probability formulas...
peterjakus [18 hours ago] however for CRO purposes, I can still see value in sessions. bounce rate without a session? is that even possible? and user based conversion rate sounds cool, but we've tried to use it and in most cases (if you aren't selling one-off things) it's worth combining it with session based conversion rate. Or rather use uplift on Revenue-Per-Visitor compared between AB test variants for granular analysis + session based conversion rate for a high level overview.
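To make the difference between the metrics being discussed concrete, here is a small worked example on made-up data. The rows and function names are hypothetical; the point is only that the same traffic gives three different numbers depending on the denominator.

```python
from typing import List, Tuple

Row = Tuple[str, str, bool]  # (user_id, session_id, converted_in_session)

# Hypothetical sample: u1 buys on the third visit, u2 never buys, u3 buys twice.
rows: List[Row] = [
    ("u1", "s1", False),
    ("u1", "s2", False),
    ("u1", "s3", True),
    ("u2", "s4", False),
    ("u3", "s5", True),
    ("u3", "s6", True),
]


def session_conversion_rate(rows: List[Row]) -> float:
    """Converting sessions divided by all sessions."""
    return sum(1 for _, _, converted in rows if converted) / len(rows)


def user_conversion_rate(rows: List[Row]) -> float:
    """Users with at least one conversion divided by all users."""
    users = {user for user, _, _ in rows}
    buyers = {user for user, _, converted in rows if converted}
    return len(buyers) / len(users)


def conversions_per_user(rows: List[Row]) -> float:
    """Total conversions divided by all users (repeat purchases count)."""
    users = {user for user, _, _ in rows}
    return sum(1 for _, _, converted in rows if converted) / len(users)


print(session_conversion_rate(rows))  # 3 of 6 sessions
print(user_conversion_rate(rows))     # 2 of 3 users
print(conversions_per_user(rows))     # 3 conversions over 3 users
```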
peterjakus [18 hours ago] Now guys, what do you think about this situation:
- visitor opens a page, reads it, goes to have lunch but keeps the tab open.
- comes back to the page 40 minutes later and will read, scroll, move the mouse, but won't trigger any pageload
- would you trigger a new session? (you can skip this question @lukasvermeer :slightly_smiling_face: )
- would you trigger a pageview (this one is for you also, Lukas) ?
- or shall we divide sessions in something like "session" and "session flashback"..... and differentiate between page_load and page_view?
lukasvermeer [18 hours ago] Your scenario describes a user who interacted, therefore didn't bounce, but also didn't purchase, so yeah. :man-shrugging:
Johan Terpstra [18 hours ago] Without some kind of fenced-off group of events/views, if you only had per user metrics or singular hits, it would also become very difficult to track improvements. If this month your KPIs per session are X and you invest a ton of money to improve it, sessions do help compare with the next month. Drawing comparisons with the analogue world, sessions also make sense. There’s a start and an end, and within it a bunch of somewhat connected events are grouped together. How long it takes does differ per user and task.
lukasvermeer [18 hours ago] Sounds like you use "session-based conversion rate" the way I would use "conversions per user"?
lukasvermeer [18 hours ago] @Johan Terpstra are you suggesting that using session-based metrics allow one to compare KPIs month over month?
lukasvermeer [18 hours ago] (Because if you are, I very very very much disagree.)
Johan Terpstra [18 hours ago] In Peter’s example, if you keep pinging the open tab, but no engagement is recorded, perhaps you should keep the session but have active and inactive time?
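A rough sketch of the active/inactive split suggested here, assuming the page sends a ping at a fixed interval and each ping carries a flag saying whether any engagement (scroll, mouse move, key press) happened since the previous ping. The 10-second interval, the flag and the `split_time` name are assumptions for illustration.

```python
from typing import List, Tuple

Ping = Tuple[float, bool]  # (timestamp in seconds, engaged since the previous ping?)


def split_time(pings: List[Ping], interval: float = 10.0) -> Tuple[float, float]:
    """Return (active_seconds, inactive_seconds) for one session's pings."""
    active = sum(interval for _, engaged in pings if engaged)
    inactive = sum(interval for _, engaged in pings if not engaged)
    return active, inactive


# Peter's lunch scenario: 2 minutes of reading, 40 minutes with the tab open
# but untouched, then 1 minute of scrolling without any page load.
pings: List[Ping] = (
    [(i * 10.0, True) for i in range(12)]
    + [(120.0 + i * 10.0, False) for i in range(240)]
    + [(2520.0 + i * 10.0, True) for i in range(6)]
)
active, inactive = split_time(pings)
print(active, inactive)
```

The session stays a single session, and the lunch break shows up as inactive time instead of inflating or splitting it.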
Johan Terpstra [18 hours ago] Certain KPIs, yes. Though it’s late and likely I’m not thinking super sharp.
Johan Terpstra [18 hours ago] This is a bit of a rabbit hole!
lukasvermeer [18 hours ago] I'm in NL too. It's late for me too. :wink:
lukasvermeer [18 hours ago] It's not a rabbit hole. It's simple: don't compare over time _unless_ you can somehow argue that the sample doesn't change over time.
lukasvermeer [18 hours ago] Which might work in a medical setting, but not on the Interwebs.
lukasvermeer [18 hours ago] I mean, genes drift and all, but not _that_ fast.
lukasvermeer [18 hours ago] Whereas PPC and SEO changes will likely change our samples more often than we realise.
peterjakus [17 hours ago] @lukasvermeer I love this!
- _Number of pageviews_ would be used for evaluating reach or campaigns (e.g. SEO)... segmented by length of pageview, engagement level with page...
- conversion per user as a high level metric
- uplift on revenue per visitor to evaluate particular campaign...
that COULD work. I can see life without a session now :slightly_smiling_face:
BUT
How do you define USER? Cookie? List of cookies that you've managed to associate to a certain person id?
peterjakus [17 hours ago] I understand that in booking.com you would have it easier (because you know how to motivate people to log in)
lukasvermeer [17 hours ago] So there's some bad news also.
lukasvermeer [17 hours ago] "User" has many of the same problems as "session".
lukasvermeer [17 hours ago] Just less so. :champagne:
lukasvermeer [17 hours ago] (And no we do not have it easier, I think.)
Cory Underwood [17 hours ago] Sessions are useful because they often match what the platform thinks of as a session: the platform has to allocate memory / storage to maintain a session, so there is an actual incentive to end the session when it's not needed.
lukasvermeer [17 hours ago] @Cory Underwood are you arguing it is a useful metric because the machines that run the website care?
lukasvermeer [17 hours ago] Hardware is cheap. Just buy more servers.
Cory Underwood [17 hours ago] The machines care, so if you tie a specific visit to the same time window the machine cared about, it's good to have a common frame of reference.
peterjakus [17 hours ago] @Cory Underwood do they really match in your experience?
Cory Underwood [17 hours ago] @peterjakus they do in our system, because I made sure of it for debugging purposes :wink:
lukasvermeer [17 hours ago] I'd gladly trade machine memory / storage for revenue per user.
lukasvermeer [17 hours ago] I'm curious though: why would you need to tie them?
lukasvermeer [17 hours ago] Oh wait, is this a client-side measurement thing?
lukasvermeer [17 hours ago] For me, logging when the machine cares == source for the experimentation platform. Same pipe. They are aligned by definition.
Cory Underwood [17 hours ago] Say someone had a problem, and you think it may be one of the application servers. Well, if you're tying that to the analytics session, you can backtrack it to the specific box and analyze what was going on.
lukasvermeer [17 hours ago] Right. I get it now. Have you read this? https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185
Cory Underwood [17 hours ago] I have.
lukasvermeer [17 hours ago] Seems that this is not a feature of using session based metrics, but the level of integration between application servers and experimentation platform.
lukasvermeer [17 hours ago] But I can see how, if you don't have that integration, you could use session based metrics.
lukasvermeer [17 hours ago] Still. That would mean they are useful for debugging mostly, but not experimentation per se?
lukasvermeer [17 hours ago] (@Johan Terpstra was right btw: it is waaaaaay past bedtime here.)
Cory Underwood [17 hours ago] In experiments it's more open to debate. I guess it'd depend on what you are measuring to a large extent.
Cory Underwood [17 hours ago] If you wanted to know 'how many visits...' till some action, you'd have to be able to break up the time blocks into something where you could count them.
Cory Underwood [17 hours ago] I'd also argue that due to cross device, 'user' has a whole different class of problems unless you can make them self identify.
Cory Underwood [17 hours ago] and to @lukasvermeer - comparing over time may be important, if you wanted to measure say - visits or days to something like a purchase, or a credit card application, because it helps frame up what your normal sales cycle may be in terms of length.
zjuul [17 hours ago] The term _visit_ is useful to communicate, like a metaphor. "I'm visiting a shop, do something, then leave"
For an app like instagram, this _visit_ metaphor is utterly useless. If I were to model an app like this, I would definitely not use "visit".
When you cluster together measurements in a way that makes sense to *your* model, a _visit_ or _session_ will be replaced by some richer vocabulary.
- a "price check" (booking)
- a "quick refresh" (social network)
- a "period of being bored" (news site)
When you cannot cluster measurements together yourself, you're stuck with factory default metrics, like _session_. Not bad per se - at least you have a shared vocabulary.
Zabe Agha [16 hours ago] @peterjakus >@Zabe Agha 72 hours? Wow. That could actually work. So you perhaps didn't plan to use "Sessions brought by Campaign" at all (because you'd quite often have several campaigns assisting in the same session), right?
That's correct.
Zabe Agha [16 hours ago] @lukasvermeer > Unless you're a mathematician and you want to simplify some probability formulas...
This is it. Because this is exactly what we're doing. Predicting the probability of: add to cart, purchase, abandonment.
Zabe Agha [16 hours ago] @peterjakus Your example is 2 sessions given the current "definition"
peterjakus [16 hours ago] @Cory Underwood so what session definition are you using? I suppose not GA with the midnight cutoff and other quirks
Cory Underwood [14 hours ago] @peterjakus same as the server, a block of time followed by 30 minutes of no activity. However, we also pass the session id of the server in as a Custom Dim, so that even if the GA session resets due to attribution, we can map it back.
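The mapping described here can be illustrated with a small sketch: every hit carries the server's session id (passed as a custom dimension), so GA sessions that were split by an attribution reset can be stitched back together. The field names and ids below are hypothetical sample data, not an actual GA export format.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical exported hits; "server_session" stands for the custom dimension.
hits = [
    {"ga_session": "ga-1", "server_session": "srv-A", "page": "/"},
    {"ga_session": "ga-1", "server_session": "srv-A", "page": "/category"},
    # The visitor clicks a campaign link: GA starts a new session, the server does not.
    {"ga_session": "ga-2", "server_session": "srv-A", "page": "/product"},
    {"ga_session": "ga-3", "server_session": "srv-B", "page": "/"},
]


def ga_sessions_by_server_session(hits: List[dict]) -> Dict[str, List[str]]:
    """Map each server session id to the GA session ids seen within it, in order."""
    mapping: Dict[str, List[str]] = defaultdict(list)
    for hit in hits:
        ga_ids = mapping[hit["server_session"]]
        if hit["ga_session"] not in ga_ids:
            ga_ids.append(hit["ga_session"])
    return dict(mapping)


print(ga_sessions_by_server_session(hits))
```

Any server session that maps to more than one GA session is a visit that GA split, for example because of a campaign change.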
simoahava [12 hours ago] Well, I guess GA uses sessions because it understands how fragile user calculations are. Having a user-based conversion rate in a browser-based analytics tool with flimsy first-party cookies keeping count would lead to a lot of problems when explaining quirky data points, much more than with sessions.
simoahava [12 hours ago] Also, legacy. And because it’s a suitable level of abstraction to mirror changing intent when coupled with the acquisition channel.
simoahava [12 hours ago] In my ideal world, we’d still have sessions, but they would be fluid, adapting to whatever ML or AI decided what the user’s current attention span is.
Started watching a video? Extend the session to last at least until the video is over.
Landed from Facebook to a blog post? Stop the session as soon as the user leaves the site.
It’s useful to have some meaningful chunk of data smaller than a user but bigger than a hit.
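The two examples above can be sketched with hand-written rules standing in for whatever ML or AI would actually decide. Everything here (the event fields, the `session_deadline` name, the rules themselves) is an assumption for illustration, not a description of any existing tool.

```python
DEFAULT_TIMEOUT = 30 * 60  # seconds


def session_deadline(event: dict, default_timeout: float = DEFAULT_TIMEOUT) -> float:
    """Return the time at which the session would expire if nothing else happens."""
    ts = event["ts"]
    if event["type"] == "video_start":
        # Keep the session open at least until the video is over.
        return ts + max(default_timeout, event["video_length"])
    if event["type"] == "page_exit" and event.get("landing_source") == "facebook":
        # Social drive-by on a blog post: end the session as soon as the user leaves.
        return ts
    return ts + default_timeout


print(session_deadline({"ts": 0.0, "type": "video_start", "video_length": 3600.0}))
print(session_deadline({"ts": 100.0, "type": "page_exit", "landing_source": "facebook"}))
print(session_deadline({"ts": 100.0, "type": "pageview"}))
```

A learned model would replace the hard-coded branches, but the interface stays the same: each event moves the session's expiry time instead of relying on one fixed timeout.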
lukasvermeer [7 hours ago] Meta comment: this thread is in dire need of subthreads. :sweat_smile:
simoahava [7 hours ago] True :smile: Sometimes I think Slack misses so many features from Reddit, and vice versa