What is GA4 — Complete Mastery of Google Analytics 4 Concepts and Structure Part 1
Complete Mastery of GA4 from Beginning to End: What It Is, Why It Was Created, and How It Works
You open GA4 every day and look at the numbers: visitors, sessions, events. But do you know where and how those numbers are created?
You can run a site without knowing this. But there is a real difference between reading the numbers with an understanding of GA4's structure and simply reading the numbers. Given the same data, some people extract insights while others just note that "the visitors increased" and move on.
This series covers GA4 from beginning to end. In Part 1, we'll explore why GA4 was created, what's different from the previous version, and how data is collected and stored structurally. Once you grasp the concepts, you'll be able to read subsequent reports correctly.
The Background of GA4's Emergence
The history of Google Analytics began in 2005, when Google acquired Urchin Software. The tool went on to become the standard in web analytics. In 2012 it received a major upgrade under the name UA (Universal Analytics), which was installed on tens of millions of sites worldwide over roughly the next 10 years.
Then in 2020, Google released an analytics tool redesigned from the ground up: GA4.
Why did Google abandon the well-functioning UA and create something new? There are three reasons.
The Rise of Smartphones and Apps
When UA was being designed, most web usage happened on PC browsers. But as smartphones became widespread, the situation changed. People started using the internet through apps, mobile browsers, and tablets.
UA was structurally designed as a tool for web browsers. It was very difficult to integrate and analyze apps and web in a single tool. For example, if a user viewed a product on a smartphone app and later purchased it on a PC browser, UA had difficulty connecting these two actions to the same person. This was because the devices were different, the sessions were different, and the cookies were different.
The Limitations of Cookie-Based Tracking
UA relied heavily on browser cookies to identify users. But from the late 2010s onwards, cookie regulations began strengthening globally.
Europe's GDPR, Apple's ITP (Intelligent Tracking Prevention), and Chrome's announcement that it would phase out third-party cookies all pointed in the same direction. This trend undermined cookie-based tracking itself, and UA's structure was difficult to adapt to it.
The Need for Machine Learning and Predictive Analytics
The paradigm of data analysis was changing. It was evolving from seeing "what happened in the past" to predicting "what will happen in the future." UA's structure was strong at aggregating historical data, but the design itself wasn't suited to incorporating predictive analytics using machine learning.
GA4 was redesigned from scratch to solve these three problems. It officially launched in October 2020, and on July 1, 2023, standard UA properties stopped processing new data (paid 360 properties were given an extension until July 2024). GA4 is now the only standard.
What's Different Between UA and GA4
On the surface, both look like "tools that show visitor numbers." But the way data is collected and stored is fundamentally different. If you don't understand this difference, you can't interpret GA4's numbers correctly.

Session-Based vs Event-Based
UA's data unit is the session. When a user accesses a site, a session is created, and all page views, events, and conversions occurring within that session are attributed to that session.
GA4 is different. GA4's data unit is the event. Everything that happens on a site is an event.
| Action | GA4 Event Name |
|---|---|
| Opened a page | page_view |
| Scrolled a page to 90% | scroll |
| Clicked an external link | click |
| Started playing a video | video_start |
| Completed a purchase | purchase |
Sessions still exist in GA4, but they're no longer the basic unit of data. A session starts with a session_start event, and all subsequent events have a session ID attached as a parameter. The session has become an attribute that describes the relationship between events.
Thanks to this structure, GA4 can handle web and apps in the same way. The screen_view event when viewing a screen in an app and the page_view event when viewing a page on the web have the same structure. This is why you can integrate and analyze web and app data in one property.
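The idea that a session is an attribute of events, not a container for them, can be sketched in a few lines of Python. This is a simplified illustration, not GA4's internal implementation: the `Event` class and the sample values are hypothetical, and `ga_session_id` is used as the name of the session ID parameter.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """Simplified stand-in for a GA4 event: a name plus free-form parameters."""
    name: str
    params: dict = field(default_factory=dict)


def group_by_session(events):
    """Rebuild sessions by grouping event names on their session ID parameter."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event.params["ga_session_id"]].append(event.name)
    return dict(sessions)


events = [
    Event("session_start", {"ga_session_id": 1001}),
    Event("page_view", {"ga_session_id": 1001}),    # viewed on the web
    Event("scroll", {"ga_session_id": 1001}),
    Event("session_start", {"ga_session_id": 1002}),
    Event("screen_view", {"ga_session_id": 1002}),  # viewed in an app, same shape
]

sessions = group_by_session(events)
print(sessions)
```

Because web and app events share the same name-plus-parameters shape, the same grouping logic works for both.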
Changes in User Identification Method
In UA, the primary way to identify users was the client ID stored in browser cookies. If the cookie was deleted, a different browser was used, or a different device was used, the user was recognized as a different person.
GA4 uses three methods hierarchically for user identification.

| Priority | Method | Description |
|---|---|---|
| 1st | User ID | Assign unique ID to logged-in users. Same user recognized across devices |
| 2nd | Google Signals Data | If Google account is logged in + ad personalization is enabled, Google identifies by account |
| 3rd | Device-Based Identification | Browser cookie (client ID) or app instance ID. Similar to UA method |
This identification priority is controlled by GA4's Reporting Identity setting. You can select which method to prioritize in GA4 Property → Data Display → Reporting Identity.
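The fallback order in the table can be pictured as a simple priority check. The sketch below is only an illustration of the ordering: the function name and identifiers are hypothetical, and GA4's actual resolution happens inside Google's processing, not in code you write.

```python
from typing import Optional, Tuple


def resolve_identity(user_id: Optional[str],
                     signals_id: Optional[str],
                     device_id: str) -> Tuple[str, str]:
    """Return (method, identifier) for the highest-priority identifier available."""
    if user_id:
        return ("user_id", user_id)            # 1st: logged-in User ID
    if signals_id:
        return ("google_signals", signals_id)  # 2nd: Google signals data
    return ("device", device_id)               # 3rd: cookie / app instance ID


logged_in = resolve_identity("u-42", None, "cid-abc")
anonymous = resolve_identity(None, None, "cid-abc")
print(logged_in, anonymous)
```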
Flexibility of Event Structure
In UA, custom event data had to fit into four fixed fields: category, action, label, and value. Data that didn't match this framework had to be forced in.
GA4 allows free design of event names and parameters. Using a blog post reading event as an example, it can be structured like this:
Event Name: post_read
Parameters:
post_title: "What is GA4"
post_category: "Dev"
read_percentage: 85
reading_time_seconds: 420
To contain the same data in UA, you had to put "post_read" in category, "Dev" in action, the title in label, and separately set up a custom dimension for read percentage. GA4's approach is much more intuitive.
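Written side by side as plain Python dictionaries, the difference is easy to see. This is a sketch for comparison only and nothing is sent anywhere. The `name`/`params` shape of the GA4 dictionary is an assumption chosen for illustration, and the UA dictionary simply mirrors the four fields described above.

```python
import json

# UA: everything squeezed into four fixed fields.
ua_event = {
    "category": "post_read",
    "action": "Dev",
    "label": "What is GA4",
    "value": None,  # read percentage had to go into a separate custom dimension
}

# GA4: an event name plus whatever parameters describe the event.
ga4_event = {
    "name": "post_read",
    "params": {
        "post_title": "What is GA4",
        "post_category": "Dev",
        "read_percentage": 85,
        "reading_time_seconds": 420,
    },
}

print(json.dumps(ga4_event, indent=2))
```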
Free BigQuery Integration
In UA, BigQuery integration was only possible with Google Analytics 360 (the paid version of UA). The ability to export raw data was a paid feature.
GA4 provides BigQuery export for free. The reports in the GA4 interface show data that Google has pre-aggregated, and sampling may occur depending on circumstances. Raw data exported to BigQuery is stored as individual records one by one without sampling. Direct analysis with SQL allows for a depth of analysis impossible in the GA4 interface.
Below is a table summarizing the key differences between UA and GA4.
| Item | UA (Universal Analytics) | GA4 |
|---|---|---|
| Data Unit | Session | Event |
| Web+App Integration | Difficult (separate tool needed) | Natively supported |
| User Identification | Cookie-based | Hierarchical (User ID → Signal → Cookie) |
| Event Structure | Fixed category/action/label/value | Free design of name + parameters |
| BigQuery | Paid version only | Free |
| Predictive Analysis | Not supported | Machine learning-based prediction supported |
| Service Status | Ended July 2023 | Current standard |
Event-Based Data Structure
GA4's core is events. Understanding how events are classified and what structure they have makes subsequent setup and analysis much clearer.
Four Types of Events
GA4 events are divided into four types based on collection method.

1. Automatically Collected Events
These are automatically collected simply by installing the GA4 tag. No separate configuration is needed.
| Event | When It Occurs |
|---|---|
| first_visit | When a user visits the site for the first time |
| session_start | When a session starts |
| user_engagement | When the page has been in focus (or the app in the foreground) for at least one second. Used to measure engagement time |
2. Enhanced Measurement Events
These are additionally collected when you enable enhanced measurement in GA4 data stream settings. GA4 automatically detects them without code writing.
| Event | When It Occurs |
|---|---|
| page_view | When a page loads (enabled by default) |
| scroll | When a page is scrolled 90% or more |
| click | When an external link is clicked |
| view_search_results | When a site search results page is viewed |
| video_start | When YouTube video playback starts |
| video_progress | When YouTube video reaches 10%, 25%, 50%, 75% |
| video_complete | When YouTube video playback completes |
| file_download | When a file is downloaded |
3. Recommended Events
Events that Google recommends by industry. GA4 doesn't automatically collect them, but if you follow the specified name and parameter format, GA4's standard reports will automatically aggregate them.
Taking e-commerce related recommended events as examples:
| Event | Meaning | Required Parameters |
|---|---|---|
| view_item | Product detail page view | items |
| add_to_cart | Added to cart | items, value, currency |
| begin_checkout | Checkout started | items, value, currency |
| purchase | Purchase completed | transaction_id, value, currency, items |
These events must have the exact name and parameters to be processed correctly in GA4 reports.
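One way to guard against a misnamed or incomplete event is to check it before sending. The sketch below encodes only the four events and parameters listed in the table above. The helper name `missing_params` is hypothetical and not part of any GA4 library.

```python
# Required parameters, copied from the e-commerce table above.
REQUIRED_PARAMS = {
    "view_item": {"items"},
    "add_to_cart": {"items", "value", "currency"},
    "begin_checkout": {"items", "value", "currency"},
    "purchase": {"transaction_id", "value", "currency", "items"},
}


def missing_params(event_name, params):
    """Return the sorted list of required parameters absent from `params`."""
    required = REQUIRED_PARAMS.get(event_name, set())
    return sorted(required - set(params))


incomplete_purchase = {"value": 29.0, "currency": "USD", "items": []}
print(missing_params("purchase", incomplete_purchase))
```

Running a check like this in a test suite or a GTM preview step catches a forgotten `transaction_id` before it shows up as a gap in reports.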
4. Custom Events
Events that don't fall into the above three types are ones you design and implement yourself. You define the behaviors you want to collect based on your site's characteristics, then write code or configure Google Tag Manager (GTM) to send events when those behaviors occur.
One important note: custom event data doesn't automatically appear in GA4's standard reports. To analyze it in reports, you must set up a custom definition to register that parameter as a dimension or metric. This is covered in detail in Part 5.
Event Structure: Name and Parameters
All events in GA4 consist of two elements.
- Event Name: Indicates what action occurred. Things like page_view, scroll, purchase.
- Parameters: Additional information about that event. Information like which page was viewed or which product was purchased.
When a user views a specific post on this blog, the page_view event that occurs actually has this structure:
Event Name: page_view
Parameters:
page_location: "https://dechive.info/archive/ga4-introduction"
page_referrer: "https://www.google.com"
page_title: "What is GA4"
engagement_time_msec: 0
Every time an event occurs, this data is sent to Google's servers, where GA4 stores it, aggregates it, and displays it in reports.
Parameters come in two types.
| Type | Description | Example |
|---|---|---|
| Event Parameter | Context information when that event occurs | Page URL, product price, category |
| User Property | Information about the user itself, independent of events | Login status, membership grade, language setting |
Once user properties are set, they're automatically attached to all subsequent events and sent.
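The "set once, attached afterwards" behavior can be modeled with a toy tracker. This is a hypothetical class written for illustration, not the GA4 tag's real code. `membership_grade` is an example property name taken from the table above.

```python
class Tracker:
    """Toy tracker: user properties persist and ride along with every later event."""

    def __init__(self):
        self.user_properties = {}
        self.sent = []

    def set_user_property(self, name, value):
        self.user_properties[name] = value

    def send_event(self, name, **params):
        # Copy the properties so later changes don't rewrite past events.
        self.sent.append({
            "name": name,
            "params": params,
            "user_properties": dict(self.user_properties),
        })


tracker = Tracker()
tracker.send_event("page_view", page_title="What is GA4")
tracker.set_user_property("membership_grade", "gold")
tracker.send_event("scroll", percent_scrolled=90)
tracker.send_event("page_view", page_title="Part 2")
print(tracker.sent)
```

Events sent before the property was set do not carry it. Every event sent afterwards does.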
GA4's Data Collection Flow
Understanding the process GA4 goes through from data collection to report display helps you identify causes when data looks unusual later.

Step 1: User Action Occurs
Users access the site, view pages, scroll, click, and make purchases. All these actions are the source of data.
Step 2: Event Generation on the Client
The GA4 tag installed on the site detects the user's action and creates event data. This process happens with JavaScript executing inside the user's browser.
If enhanced measurement is enabled, the GA4 tag automatically detects scrolling, clicking, page transitions, and so on. For custom events, code written by developers or tags configured in GTM generate data at the appropriate time.
Step 3: Transmission to Google Servers
The generated event data is sent to Google's collection servers. It's sent via HTTP POST request, and you can confirm it in the Network tab of browser developer tools as a google-analytics.com/g/collect request.
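When inspecting such a request, it helps to decode its query string. The sketch below parses a made-up sample URL with Python's standard library. The short parameter names (`tid`, `cid`, `en`, `dl`) are ones commonly seen in these requests, but treat them as an observation and not a documented contract. The values are placeholders.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical request URL, shaped like what appears in the Network tab.
sample_request = (
    "https://www.google-analytics.com/g/collect"
    "?v=2&tid=G-XXXXXXXXXX&cid=123.456&en=page_view"
    "&dl=https%3A%2F%2Fdechive.info%2Farchive%2Fga4-introduction"
)

query = parse_qs(urlsplit(sample_request).query)
summary = {
    "measurement_id": query["tid"][0],
    "client_id": query["cid"][0],
    "event_name": query["en"][0],
    "page_location": query["dl"][0],  # parse_qs decodes the percent-encoding
}
print(summary)
```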
There's one important thing to know here. If this transmission fails, data is lost. If users navigate away too quickly or ad blockers prevent GA4 requests, data may not be collected. GA4's collected data doesn't reflect 100% of actual user behavior. It should be understood as a tool for understanding trends.
Step 4: Data Processing and Storage
Event data that reaches Google's servers goes through a processing phase. Sessions are calculated, bot traffic is filtered, geographic location and device information are categorized. Filters you've set up or event modifications are also applied during this step.
This processing takes time. According to Google's official documentation, most GA4 report data is fully processed and displayed within 24-48 hours of event occurrence. The real-time report is the exception: it shows events within minutes. To see today's data accurately in standard reports, it's safest to check the next day.
Step 5: Display in Reports
Processed data is displayed in the GA4 interface. GA4 reports don't show raw data as-is, but rather pre-aggregated data. When the amount of data is large, sampling may occur.
To see raw data, meaning the individual records of each event, BigQuery integration is necessary. In BigQuery, raw event data is stored by date in tables with the format events_YYYYMMDD. This will be covered in detail in the latter part of the series.
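As a small preview, the daily table name can be derived from a date, and a basic query can be built against it. This is a sketch under assumptions: the project name `my-project` and dataset name `analytics_123456789` are placeholders, and the code only builds the SQL text without running it.

```python
from datetime import date


def daily_table(project, dataset, day):
    """Build the fully qualified name of one day's export table."""
    return f"{project}.{dataset}.events_{day:%Y%m%d}"


def event_count_query(table):
    """Return SQL text that counts raw events per event name in the given table."""
    return (
        "SELECT event_name, COUNT(*) AS event_count\n"
        f"FROM `{table}`\n"
        "GROUP BY event_name\n"
        "ORDER BY event_count DESC"
    )


# Placeholder project and dataset names.
table = daily_table("my-project", "analytics_123456789", date(2024, 1, 15))
print(event_count_query(table))
```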
GA4's Account Structure
When first setting up GA4, the account structure can be confusing. GA4 has a 3-level structure: account, property, and data stream.

| Level | Description | Example |
|---|---|---|
| Account | Top-level unit. One business or organization | Dechive Account |
| Property | Basic unit of analysis. Web+app integration possible. Identified by a numeric property ID | Dechive Property |
| Data Stream | Channel where data flows in. Divided into Web / Android / iOS. Each web stream is assigned a Measurement ID (G-XXXXXXXXXX) | Dechive Web Stream |
You can create multiple properties within one account, and add multiple data streams within one property.
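The nesting can be written out as a small data model. The classes below are hypothetical and exist only to show the one-to-many relationships. The iOS stream is an invented example added to show a second stream under the same property.

```python
from dataclasses import dataclass, field


@dataclass
class DataStream:
    name: str
    platform: str  # "web", "android", or "ios"


@dataclass
class Property:
    name: str
    streams: list = field(default_factory=list)


@dataclass
class Account:
    name: str
    properties: list = field(default_factory=list)


account = Account("Dechive Account", [
    Property("Dechive Property", [
        DataStream("Dechive Web Stream", "web"),
        DataStream("Dechive iOS Stream", "ios"),  # hypothetical second stream
    ]),
])

platforms = [s.platform for p in account.properties for s in p.streams]
print(platforms)
```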
GA4 Key Terminology Summary
Terms frequently encountered in GA4 reports. Understanding their precise meaning is essential for reading numbers correctly.
User
GA4 aggregates users in two ways.
| Category | Description |
|---|---|
| Total Users | Users who visited at least once during the selected period |
| Active Users | Users who started an engaged session, triggered a conversion event, or first launched an app |
The "users" number displayed by default in GA4 reports is the active users count. Don't directly compare with UA's user figures.
Session
A unit of interaction when a user accesses a site. In GA4, a session starts with a session_start event and ends after 30 minutes of no interaction. Unlike UA, sessions aren't forcibly split at midnight.
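The timeout rule can be sketched as follows. This is a simplified model, not GA4's actual processing. It assumes a gap of 30 minutes or more between consecutive events starts a new session, and it deliberately does nothing special at midnight.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(timestamps):
    """Group event times into sessions, splitting on gaps of 30+ minutes."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < SESSION_TIMEOUT:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # first event, or gap too long
    return sessions


event_times = [
    datetime(2024, 1, 15, 23, 50),
    datetime(2024, 1, 16, 0, 5),   # crosses midnight, but the gap is only 15 min
    datetime(2024, 1, 16, 0, 50),  # 45 min gap, so a new session starts
]
sessions = split_sessions(event_times)
print(len(sessions))
```

The first two events land in the same session even though they fall on different calendar days.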
Engaged Session is a concept newly introduced in GA4. It refers to a session meeting at least one of these three conditions:
- Session where user stayed on site for 10+ seconds
- Session where a conversion event occurred
- Session where 2+ pages or screens were viewed
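Expressed as code, the three conditions form a simple "any of" check. The function and argument names are hypothetical. The thresholds follow the list above.

```python
def is_engaged(duration_seconds, conversion_count, views):
    """True if the session meets at least one of the three engagement conditions."""
    return duration_seconds >= 10 or conversion_count >= 1 or views >= 2


print(is_engaged(4, 0, 1))   # short, no conversion, one page
print(is_engaged(4, 0, 2))   # short, but two pages were viewed
```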
Conversion
Important events related to business goals. When you mark a desired event as a conversion, GA4 reports aggregate it separately, and you can use it as an optimization objective when linking with Google Ads. This is the equivalent of UA's "goal."
Dimensions and Metrics
The two most fundamental concepts for understanding GA4 reports.
| Concept | Description | Form | Example |
|---|---|---|---|
| Dimension | Criteria for classifying data | Text (string) | Country, device type, channel, page URL |
| Metric | Measurable numerical values | Numbers | User count, session count, conversion rate |
Reports always combine "what dimension to divide by" and "what metric to view." Examples include "active users by country" or "conversion rate by channel."
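A report row is, in effect, a group-by: the dimension is the grouping key and the metric is the number computed per group. The sketch below computes "users by country" from a handful of made-up rows. The data and field names are hypothetical.

```python
from collections import defaultdict

# Made-up event rows: one dimension (country) and a user identifier.
rows = [
    {"country": "Korea", "user": "a"},
    {"country": "Korea", "user": "b"},
    {"country": "Korea", "user": "a"},  # same user again, counted once
    {"country": "Japan", "user": "c"},
]

users_by_country = defaultdict(set)
for row in rows:
    users_by_country[row["country"]].add(row["user"])

# Dimension value -> metric value
report = {country: len(users) for country, users in users_by_country.items()}
print(report)
```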
Engagement Rate and Bounce Rate
| Indicator | Definition | Better When |
|---|---|---|
| Engagement Rate | Ratio of engaged sessions to all sessions | Higher |
| Bounce Rate | Ratio of non-engaged sessions to all sessions (= 100% - Engagement Rate) | Lower |
UA's bounce rate and GA4's bounce rate are calculated differently. Don't directly compare bounce rate figures between the two tools.
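The relationship between the two GA4 indicators is plain arithmetic. A minimal sketch, with hypothetical function names and made-up session counts:

```python
def engagement_rate(engaged_sessions, total_sessions):
    """Share of sessions that were engaged (0.0 when there are no sessions)."""
    if total_sessions == 0:
        return 0.0
    return engaged_sessions / total_sessions


def bounce_rate(engaged_sessions, total_sessions):
    """Share of sessions that were not engaged (0.0 when there are no sessions)."""
    if total_sessions == 0:
        return 0.0
    return (total_sessions - engaged_sessions) / total_sessions


# Example: 650 engaged sessions out of 1,000.
er = engagement_rate(650, 1000)
br = bounce_rate(650, 1000)
print(f"engagement rate {er:.0%}, bounce rate {br:.0%}")
```

The two rates always sum to 100% for any non-empty set of sessions, which is why GA4's bounce rate says nothing that the engagement rate does not.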
Summary
The content covered in Part 1 is compressed into three points.
First, GA4 is event-centric. Everything that happens on the site (page views, clicks, scrolls, purchases) is an event. These events are the raw material for GA4 analysis.
Second, GA4's analysis is centered on users. It aims to connect actions across multiple devices to a single user. Unlike UA, which relied only on cookies, GA4 hierarchically leverages user IDs and Google signals data.
Third, GA4's data goes through a processing phase. From event occurrence to report display takes 24-48 hours, and the data in reports is in aggregated form. If you need raw data, use BigQuery.
In Part 2, we'll dive deeper into GA4's event design. We'll cover what events you should collect, how to design events and parameters so analysis produces useful data, and what the criteria are for conversion design.
