Difference between revisions of "Python Landscape"

Latest revision as of 16:06, 11 November 2024

Welcome to the Python Wiki for Data Analysis

Python has become one of the most popular programming languages because of its simplicity, readability and versatility. It is a high-level, interpreted programming language created by Guido van Rossum and first released in 1991. Python emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than is possible in languages such as C++ or Java. For aspiring data analysts or scientists, Python therefore represents a convenient entry into the programming world, because it can be learned and mastered in a relatively short period of time.
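
As a small illustration of the conciseness mentioned above, the following sketch counts word frequencies in a sentence using only the standard library. The sample sentence and the variable names are made up for this example and do not come from the wiki entries.

```python
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = "the quick brown fox jumps over the lazy dog the end"

# Split the text into words and count how often each one occurs.
word_counts = Counter(text.split())

# Take the single most frequent word together with its count.
most_common_word, count = word_counts.most_common(1)[0]
print(most_common_word, count)  # the 3
```

The equivalent program in C++ or Java would typically need an explicit map, a loop over the words and more boilerplate, which is the kind of difference the paragraph above refers to.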


What can you do with Python?

With Python you can conduct simple and complex data analysis, design and launch websites, and even design artificial intelligence algorithms. This diagram offers just a glimpse of the areas of application when working with Python. Click on the different parts of the diagram to be redirected to the respective pages.

Figure: Python landscape clickable diagram. Click on the desired area to visit the respective page. Source: own.
Diagram areas: Setting up the Python Environment, Python Basics, Python Intermediate, Python Advanced, Data Analysis, Data Engineering, Version Control and Management, Statistics with Python, Artificial Intelligence, Data Visualization, Data Application Development, Geospatial Modelling.


The above diagram consists of the following sections:

  • Setting up the Python Environment: Probably the most important section. Here you will learn how to properly set up your working environment for your data projects: downloading Python and libraries, using package managers and selecting a suitable Integrated Development Environment.
  • Python Basics: In this section you will learn the data types and basic commands in Python, from printing "Hello World" and creating loops, conditionals and functions to manipulating string data.
  • Python Intermediate: In this section you will learn about data types, operations with lists and dictionaries, the differences between local and global variables, and debugging errors.
  • Python Advanced: In this advanced level, you will refine and expand your Python skills by learning about classes, comprehensions, attributes, concurrency and more.
  • Data Analysis: By going deep into the pandas library, you will learn everything from importing, cleaning and manipulating data to carrying out a preliminary analysis of it. This section is closely related to the Statistics with Python and Data Visualization sections.
  • Statistics with Python: For data analysis you will definitely use statistics. From descriptive to inferential statistics, and from univariate to multivariate statistics, this section shows you the Python libraries and functions for your statistical analysis workflow. In this section you can also find wiki entries related to math and linear algebra in Python.
  • Data Visualization: By diving into libraries such as Matplotlib, Plotly and Seaborn, you will learn basic and advanced functions to create appealing visualizations for your data analysis and storytelling.
  • Data Engineering: This section is about collecting data from the internet and other sources by creating robust data pipelines. It ranges from using APIs, web scraping and data mining to storing the collected data in a SQL database.
  • Version Control and Management: As you advance in your data journey, you will realize the need to properly manage your files, scripts, notebooks and documents. This section will help you with that, for example by showing how to use Git and GitHub.
  • Artificial Intelligence: This section will introduce you to machine learning, natural language processing, deep learning and large language models, together with the respective Python libraries such as scikit-learn, TensorFlow, PyTorch, NLTK, spaCy and LangChain, among others.
  • Data Application Development: This section will guide you in creating web application interfaces where you can show your data projects to the world. From dashboards to AI chatbots, you will learn how to build them using libraries such as Streamlit or Taipy.
  • Geospatial Modelling: If you need to deal with geospatial data, this section will help you analyse your data, for example using GeoPandas, and create maps.
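
To give a flavour of what the Python Basics section covers, here is a minimal sketch that combines printing, a function, a loop, a conditional and string manipulation. The function name `describe_number` and the sample values are hypothetical and only serve as an illustration.

```python
def describe_number(n):
    """Return a short text saying whether n is even or odd."""
    if n % 2 == 0:
        kind = "even"
    else:
        kind = "odd"
    # An f-string inserts the values of n and kind into the text.
    return f"{n} is {kind}"


print("Hello World")

# Loop over the numbers 1, 2 and 3 and collect one description for each.
descriptions = []
for number in range(1, 4):
    descriptions.append(describe_number(number))

# join and upper are string methods: combine the parts, then capitalise them.
summary = ", ".join(descriptions).upper()
print(summary)  # 1 IS ODD, 2 IS EVEN, 3 IS ODD
```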

Why is it important to learn Python nowadays?

Python is widely used in industry, particularly in data science, web development, automation, and artificial intelligence. Learning Python opens up numerous career opportunities and it is now considered an essential component in data literacy.

Learning programming is now easier than before. There are hundreds of books, websites, and communities that help people learn Python. As said above, the language is straightforward to learn, which helps new programmers pick up the basics quickly.

Now, with the emergence of Artificial Intelligence applications, learning programming has become less frustrating and more efficient. However, even without AI, Python is characterised by the excellent documentation of its libraries and frameworks. A quick Google search is usually enough to find the documentation of the specific library you need.

Why a Python Wiki when there are AI and numerous resources?

If you want to know how to start with Python, you would probably go to ChatGPT, YouTube, community blogs or paid resources such as Coursera, DataCamp, etc. Although you can get very useful information and roadmaps from ChatGPT and data science YouTubers, these sources are not consistent in the technical language they use or in their approaches.

The team behind this wiki has had the same problem and has navigated hundreds of these resources. This wiki aims to bring order to that chaos and to provide users with a basic roadmap that they can complement (with a critical perspective) with other materials on the internet.

Moreover, we are aware that Python is not perfect, and there are other programming languages such as R that can serve as a good complement. In this sense, this Python wiki is also linked to other R and statistical analysis entries.

For whom is this Python wiki targeted?

The purpose of this wiki is to offer an overview and a structured roadmap for people who want to begin their journey in Python and data, for example:

  • Students who are enrolled in courses where Python is the primary programming language used for data manipulation, analysis, and visualization.
  • Individuals who have some experience with R and want to understand the significance of Python and how it complements their existing skills.
  • Professionals from other fields looking to transition into data science.
  • Instructors who are teaching introductory programming, data science, or related fields, and who may find these entries useful for explaining the importance and applications of Python to their students.

References

To be added


The author of this entry is J. Gustavo Rodriguez Aboytes.