

Organizations increasingly rely on LinkedIn to build brand presence, run campaigns, and engage with their professional community. While LinkedIn provides native dashboards, most companies want to bring LinkedIn data into their data warehouse, such as Azure Synapse, for unified analytics alongside CRM, financial, and other marketing data.

This guide shows you how to: 

  • Get LinkedIn Community Management API access
  • Authenticate and fetch post statistics 
  • Build an Azure Synapse pipeline to land data into ADLS as Parquet 


Why Extract LinkedIn Data?

LinkedIn is one of the most valuable platforms for B2B marketing, employer branding, and thought leadership. However, relying solely on LinkedIn’s native dashboards limits how deeply organizations can analyze their campaign and engagement data. Exporting this data into Azure Synapse unlocks richer insights and advanced analytics by enabling cross-platform comparisons and custom reporting.

Here’s why organizations extract LinkedIn data into Azure Synapse:

  • Understand What Works: Identify which posts, campaigns, or content types generate the highest impressions, clicks, and engagement rates over time.
  • Know Your Audience: Analyze audience demographics such as job titles, industries, and company sizes to better tailor your messaging.
  • Measure ROI: Combine campaign data with ad spend and lead metrics to calculate true marketing ROI.
  • Create Custom Dashboards: Go beyond LinkedIn’s standard analytics with Power BI visualizations that blend multiple data sources.
  • Connect the Dots: Integrate LinkedIn analytics with data from Google Ads, Facebook, HubSpot, or Salesforce for a unified marketing performance view.
  • Optimize Performance: Use machine learning and automation within Azure Synapse to predict engagement trends and optimize posting strategies.

By centralizing LinkedIn analytics in Azure Synapse, businesses move from reactive monitoring to proactive decision-making — enabling data-driven campaign planning, deeper audience insights, and unified performance tracking.

Let’s Get Started with the Tutorial

In this section, we’ll walk through the complete setup process, from creating a LinkedIn app and generating API access tokens to configuring a REST connection and building a data pipeline in Azure Synapse. By the end, you’ll have an automated workflow that pulls your LinkedIn Page analytics directly into your Azure Synapse workspace for unified reporting and analysis.

Step 1: Create a LinkedIn App  


  • Go to the LinkedIn Developer Portal and create a new app.
  • Fill in the required information; note that you need a LinkedIn Page to associate with the app.


  • Go to the Settings tab and click the Verify button, then share the generated link with an administrator of your LinkedIn company Page. The administrator must verify the app before it can access company data, so make sure the app is verified before you continue.



Step 2: Request Access to the API & Generate Access Token 

  • Once your app is verified, navigate to the Products tab and request access to the necessary APIs. 
  • If you need access to LinkedIn Page data to work with organic content and page analytics, such as posts, followers, reactions, comments, shares, and engagement metrics, request access to the Community Management API. This API enables developers to manage LinkedIn company Pages on behalf of clients and to access related account details (admins, roles, follower details) and analytics, including comments, reactions, and other Page update activity.


  • Fill in the Access Request Form 
  • Once your application has been approved for the development tier, you will need to generate an access token.


  • Click OAuth 2.0 tools on the right-hand side of the page under Auth. 


  • Click the Create token button to begin the authorization process. 


  • Select the required scopes for Page data access:
    • r_organization_social 
    • rw_organization_admin 


  • After selecting the appropriate scopes, click Request access token. 


  • On the next screen, click Allow to authorize the app.


  • After the token is generated, copy and securely store your access and refresh tokens (a sketch for refreshing the access token programmatically follows below).

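Access tokens eventually expire, so if your pipeline runs on a schedule you may want to refresh the token programmatically instead of regenerating it in the UI. Below is a minimal sketch using LinkedIn's OAuth 2.0 refresh-token grant; the client ID and secret come from your app's Auth tab, all values shown are placeholders, and you should verify the endpoint and parameters against LinkedIn's current OAuth documentation.

```python
import requests

# Placeholders: your app credentials (from the Auth tab) and the stored refresh token
CLIENT_ID = "<your_client_id>"
CLIENT_SECRET = "<your_client_secret>"
REFRESH_TOKEN = "<your_refresh_token>"

# LinkedIn's OAuth 2.0 token endpoint, using the refresh-token grant
response = requests.post(
    "https://www.linkedin.com/oauth/v2/accessToken",
    data={
        "grant_type": "refresh_token",
        "refresh_token": REFRESH_TOKEN,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
    timeout=30,
)
response.raise_for_status()
payload = response.json()

# The new access token goes into the Authorization header of the linked service
print(payload["access_token"])
print("Expires in", payload.get("expires_in"), "seconds")
```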


Step 3: Create a REST Linked Service in Azure Synapse or Data Factory Using the UI

  • Browse to the Manage tab in your Azure Synapse or Data Factory workspace and select Linked Services, then select New: 


  • Search for REST and select the REST connector. 


  • Configure the service details, test the connection, and create the new linked service. 
  • Base URL: https://api.linkedin.com/ 
  • Authentication type: Anonymous 
  • Under Auth headers, click + New and add: 
    • Name: Authorization 
    • Value: Bearer <your_access_token> (replace with the valid LinkedIn access token, not the refresh token). 


Step 4: Add and Configure the Data Pipeline 

Once your linked service is ready, it’s time to create the data pipeline that connects everything from LinkedIn API to your data lake. 

  • Add a Copy Activity and an Integration Dataset in Azure Synapse or Data Factory


  • Name the integration dataset, select the linked service you created, and open the dataset


  • Under Parameters, add a new parameter to make the relative URL dynamic 


  • Under Connection, add dynamic content for the Relative URL and select the newly added parameter (the resulting expression looks like @dataset().RelativeUrl, where RelativeUrl is whatever you named the parameter)


  • Commit/Save the integration dataset 

Fetch Monthly Aggregated Post Statistics

We’ll use the organizationalEntityShareStatistics endpoint to extract post-level analytics (impressions, clicks, reactions, etc.). Note that this endpoint returns share (post) data only for the past 12 months, using a rolling 12-month window.


Output Schema Example:

Field                     Type     Description
clickCount                long     Number of clicks
commentCount              long     Number of comments
engagement                double   Engagement rate (clicks, likes, comments, shares per impression)
impressionCount           long     Number of impressions
likeCount                 long     Number of likes
shareCount                long     Number of shares
uniqueImpressionsCount    long     Number of unique impressions

Sample Request 

Dynamic URL for the Last 12 Months 

To automate the date range, use the following dynamic expression for the Relative URL in your Copy Activity (passed through the dataset parameter you created earlier):

 
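The sketch below shows one way to build it. The organization ID is a placeholder, the expression is spread over several lines for readability (collapse it to one line in the dynamic content editor), and the timeIntervals query syntax should be verified against the Community Management API documentation for the LinkedIn-Version you pin.

```
@concat(
  'rest/organizationalEntityShareStatistics?q=organizationalEntity',
  '&organizationalEntity=urn%3Ali%3Aorganization%3A<your_org_id>',
  '&timeIntervals=(timeRange:(start:',
  string(div(sub(ticks(startOfMonth(subtractFromTime(utcNow(), 12, 'Month'))), ticks('1970-01-01T00:00:00Z')), 10000)),
  ',end:',
  string(div(sub(ticks(utcNow()), ticks('1970-01-01T00:00:00Z')), 10000)),
  '),timeGranularityType:MONTH)'
)
```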

Explanation: 
  • Base Path: Fetches LinkedIn page post statistics for a specific organization. 
  • Time Range: Defines a 12-month rolling window, from the start of the month one year ago to the current date. 
  • Dynamic Generation: Uses the pipeline's date functions (utcNow, ticks, etc.) to produce LinkedIn's required Unix timestamp format (milliseconds since January 1, 1970).


Additional Headers 
  • X-Restli-Protocol-Version: 2.0.0 
  • LinkedIn-Version: 202508 
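
Before debugging the full pipeline, it can be useful to call the endpoint directly and confirm that the token, headers, and time window behave as expected. A rough Python sketch is shown below; the organization ID is a placeholder, and the request syntax mirrors the relative URL built above, so verify it against the Community Management API docs for your pinned LinkedIn-Version.

```python
import datetime as dt
import requests

ACCESS_TOKEN = "<your_access_token>"   # placeholder
ORG_ID = "1234567"                     # placeholder LinkedIn organization ID

# Rolling 12-month window in Unix milliseconds, matching the pipeline expression above
now = dt.datetime.now(dt.timezone.utc)
start = (now - dt.timedelta(days=365)).replace(day=1, hour=0, minute=0, second=0, microsecond=0)
start_ms, end_ms = int(start.timestamp() * 1000), int(now.timestamp() * 1000)

url = (
    "https://api.linkedin.com/rest/organizationalEntityShareStatistics"
    f"?q=organizationalEntity&organizationalEntity=urn%3Ali%3Aorganization%3A{ORG_ID}"
    f"&timeIntervals=(timeRange:(start:{start_ms},end:{end_ms}),timeGranularityType:MONTH)"
)
headers = {
    "Authorization": f"Bearer {ACCESS_TOKEN}",
    "X-Restli-Protocol-Version": "2.0.0",
    "LinkedIn-Version": "202508",
}

resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()
print(resp.json().get("elements", [])[:1])   # peek at the first element of the stats array
```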


  • Under the Sink tab, select a Parquet integration dataset to store your LinkedIn data as Parquet files in ADLS. 


  • Go to the Mapping tab:
    • Click Import Schemas
    • Change the Collection reference to elements[]
    • Click Import Schemas again
    • Remove any paging objects and fix malformed column names


Step 5: Run and Validate 

  1. Debug the pipeline to execute your Copy Activity. 
  2. Verify the data output in your ADLS container (a quick way to spot-check it from Python is sketched after this list).
  3. Optionally, create a view in Azure Synapse serverless SQL pool for Power BI to consume. 
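
To spot-check the Parquet output from outside the Synapse UI, you can read the files straight from ADLS. A small sketch using pandas with the adlfs filesystem is shown below; the storage account, container, folder, and key are all placeholders, and the adlfs and pyarrow packages are assumed to be installed.

```python
# Requires: pip install pandas pyarrow adlfs
import pandas as pd

# Placeholders: match these to your sink dataset's storage account, container, and folder
df = pd.read_parquet(
    "abfs://<container>@<storage_account>.dfs.core.windows.net/linkedin/share_statistics/",
    storage_options={"account_name": "<storage_account>", "account_key": "<storage_key>"},
)
print(df[["impressionCount", "clickCount", "engagement"]].head())
```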


Summary 

By connecting the LinkedIn Community Management API with Azure Synapse or Azure Data Factory, you can automate the ingestion and transformation of critical marketing metrics — including:

  • Post and share analytics
  • Follower growth and engagement trends
  • Page performance metrics
  • Social metadata and reactions
  • Video engagement and view-through analytics

This automated integration ensures that your LinkedIn data flows seamlessly into Azure Data Lake Storage (ADLS) in Parquet format, ready for querying, transformation, or visualization in Power BI.

Ultimately, this setup empowers your marketing and analytics teams to:

  • Access real-time campaign insights in a single, scalable data warehouse
  • Combine LinkedIn data with CRM, financial, and web analytics datasets
  • Track performance and ROI across all digital touchpoints
  • Build richer, more actionable marketing intelligence dashboards

In short, integrating LinkedIn data into Azure Synapse transforms fragmented platform metrics into a cohesive, analytics-ready data foundation, enabling smarter, faster, and more informed business decisions.


October 22, 2025

Data Science Dojo is offering Meltano CLI for FREE on Azure Marketplace, preconfigured with Meltano, a platform that provides flexibility and scalability. It is built around four qualities: it is customizable; observable, with a full view of pipeline logs and statistics; testable; and versionable, so changes can be tracked and easily rolled back if needed.

Installing the technology yourself can be a tiring process: you have to handle the setup, then deal with integration and dependency issues, and resolving installation errors can be confusing. Not to worry, Data Science Dojo’s Meltano CLI instance fixes all of that. But before we delve further into it, let’s cover some basics.

What is Meltano? 

Meltano is an open-source Command Line Interface (CLI) tool that offers a flexible and scalable solution for Extract, Load, and Transform (ELT) processes. It is designed to assist data engineers in transforming, converting, and validating data in a simplified manner while ensuring accuracy and reliability.

The Meltano CLI can efficiently handle complex data engineering tasks, providing a user-friendly interface that simplifies the ELT process. It can also integrate with different data sources, enabling users to extract data from various sources, load it into a target destination, and transform it according to their specific requirements.

In addition, it offers a range of plugins that extend its capabilities and allow users to customize their ELT workflows. These plugins include extractors, loaders, and transformers, among others.
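
As a rough illustration of how these pieces fit together, the commands below scaffold a project, add an extractor and a loader from MeltanoHub, and run them as a pipeline. The plugin names (tap-github, target-jsonl) are just example choices; swap in whatever taps and targets your project needs.

```bash
# Scaffold a new Meltano project
meltano init my-elt-project
cd my-elt-project

# Add an extractor and a loader from MeltanoHub (example plugin choices)
meltano add extractor tap-github
meltano add loader target-jsonl

# Configure the extractor, then run the extract-and-load pipeline
meltano config tap-github set --interactive
meltano run tap-github target-jsonl
```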

Challenges before Meltano

Before Meltano CLI, there were several challenges associated with data integration that made the process difficult and time-consuming. Here are a few of the main challenges: 

  • Lack of Standardization: Data integration tools were often proprietary, which made it difficult to integrate different tools and workflows. This meant that organizations often had to use multiple tools to complete a data integration project. 
  • Complexity: Many data integration tools were complex and required extensive knowledge of programming and data architecture to use effectively. This made it difficult for non-technical users to participate in data integration projects. 
  • Scalability: As data volumes grew, many data integration tools struggled to handle the scale of the data. This led to slow and inefficient data integration processes. 
  • Cost: Many data integration tools were expensive, which made them inaccessible for smaller organizations with limited budgets. 
  • Limited Customization: Many data integration tools offered limited customization options, which made it difficult to adapt the tool to fit the unique needs of an organization.

 

All in all, Meltano was designed to address many of these challenges by providing an open-source, flexible, and user-friendly tool that can be customized to fit the unique requirements of users.

Meltano CLI for ELT – Data Science Dojo

Why Meltano? 

Meltano CLI stands out as a data engineering tool because it provides flexibility and scalability. It is built around four qualities: it is customizable; observable, with a full view of pipeline logs and statistics; testable; and versionable, so changes can be tracked and easily rolled back if needed.

Meltano CLI solves many of these struggles, which makes it a compelling choice for many users:

  1. Open-source: It is free and open-source, which means that users can download, use, and modify the source code as per their needs. 
  2. Easy-to-use: It is designed to be easy to use with a simple command-line interface and intuitive user interface. Users can easily configure, execute, and monitor data integration pipelines. 
  3. Customizable: Meltano CLI offers a high degree of customization, allowing users to define custom transformations, connectors, and integrations. 
  4. Modern stack: It is built using modern open-source technologies such as Python, Flask, and Vue.js, making it easy to extend and integrate with other tools. 
  5. GitLab Integration: Meltano was originally developed at GitLab, which means it can be easily integrated with GitLab for version control, collaboration, and continuous integration and deployment (CI/CD).


Overall, Meltano CLI is a powerful and flexible data integration tool that offers a unique set of features and benefits that may make it a good choice for certain data integration projects. However, the choice of tool ultimately depends on the specific needs and requirements of the project at hand.
 

Integrations

MeltanoHub is the primary location to find all plugins, including Singer taps and targets. It serves as a single source of truth for users, making it easy to discover and use plugins within Meltano. Additionally, users can contribute to the Hub by adding more plugins, which are immediately accessible.

The Hub is maintained by Meltano and the broader community, ensuring that it is continuously curated and up to date. This centralized platform simplifies the process of finding and using plugins, enabling users to enhance their data engineering workflows with ease. 

Key features

Meltano CLI includes several features, including: 

  • Easy to setup and easy to use 
  • Pipeline creation and management 
  • Extract, load, and transform (ELT) processes
  • Plugin management 
  • Visualization 
  • Configuration management 
  • Version control 
  • Testability 
  • Integration with other tools: It seamlessly integrates with other tools such as dbt, Singer, and Airflow, among others, to enhance your workflow.

What does Data Science Dojo have for you?

The Azure Virtual Machine comes preconfigured with the Meltano CLI and plug-and-play functionality, so you do not have to worry about setting up the environment.

  • Features include a zero-setup CLI platform that offers a high degree of customization, allowing users to define custom transformations, connectors, and integrations. It is designed to be easy to use with a simple command-line interface and intuitive user interface.
  • Meltano CLI helps you efficiently transform, convert, and validate your data using a simplified process for data engineering, with the assurance of accuracy and reliability. 

 

You can explore many more features by taking a quick peek here: Meltano CLI on Azure Marketplace. What sets it apart is that it is an open-source, flexible, and scalable CLI for ELT+. It is customizable. It is also observable: it provides a full view with detailed pipeline logs and statistics and allows inspection of code for debugging. Meltano is versionable, which allows easy tracking and rollback of changes. And it is testable, so you only deploy to production once everything is green.

Moreover, Meltano CLI is a powerful and flexible data integration tool that offers many benefits over other tools on the market. Its open-source nature, ease of use, integration with other tools, reconfigurability, and community support make it a compelling choice for data integration projects. 

Conclusion  

The Meltano CLI instance comes preconfigured on Ubuntu 20.04 with a ready-to-use project, allowing for a plug-and-play experience without any setup required. By using Azure, the fault tolerance of data pipelines is increased, resulting in higher performance and faster content delivery.

The Meltano CLI provides an open-source, flexible, and scalable CLI for ELT+, allowing for efficient data transformation, conversion, and validation with accuracy and reliability. When combined with Microsoft Azure services, Meltano outperforms traditional methods by performing data-intensive computations in the cloud. Collaboration and sharing of notebooks with stakeholders is also possible.

At Data Science Dojo, we deliver data science education, consulting, and technical services to increase the power of data. We are therefore adding a free project environment dedicated specifically to data integration and ELT on Azure Marketplace. Don’t wait to install this offer by Data Science Dojo, your ideal companion on your journey to learn data science!


 

Written by Insiyah Talib

March 15, 2023
