Python Landscape



Welcome to the Python Wiki for Data Analysis

Python has become one of the most popular programming languages because of its simplicity, readability and versatility. It is a high-level, interpreted programming language created by Guido van Rossum and first released in 1991. Python emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java. For aspiring data analysts or scientists, Python therefore represents a convenient entry into the programming world, because its basics can be learned in a relatively short period of time.

What can you do with Python?

With Python you can conduct simple and complex data analysis, design and launch websites, and even design artificial intelligence algorithms. This diagram gives just a glimpse of the areas of application when working with Python. Click on the different parts of the diagram to be redirected to the respective pages.

Figure: Python landscape clickable diagram. Click on the desired area to visit the respective page. Source: own.


The above diagram consists of the following sections:

  • Setting up the Python Environment: Probably the most important section. Here you will learn how to properly set up your working environment for your data projects: downloading Python and libraries, using package managers and selecting a suitable Integrated Development Environment.
  • DATAx: This section is aimed at first-semester Leuphana students who are taking the DATAx course. It contains curated Python wiki entries, mostly regarding data analysis.
  • Python Basics: In this section you will learn the data types and basic commands in Python, from printing "Hello World" and creating loops, conditionals and functions to manipulating string data.
  • Python Intermediate: In this section you will learn about data types, operations with lists and dictionaries, the differences between local and global variables and debugging errors.
  • Python Advanced: In this advanced level, you will refine and expand your Python skills by learning classes, comprehensions, attributes, concurrency, etc.
  • Data Analysis: By going deep into the pandas library, you will learn everything from importing, cleaning, and manipulating data to carrying out a preliminary analysis of it. This section is closely related to the Statistics with Python and Data Visualization sections.
  • Statistics with Python: For data analysis you will definitely use statistics. From descriptive to inferential statistics, from univariate to multivariate statistics, this section shows you the Python libraries and functions for your statistical analysis workflow. Moreover, in this section you can also find wiki entries related to math and linear algebra in Python.
  • Data Visualization: By diving into libraries such as Matplotlib, Plotly and Seaborn, you will learn basic and advanced functions to create appealing visualizations for your data analysis and storytelling.
  • Data Engineering: This section is about collecting data from the internet and other sources by creating robust data pipelines. It covers everything from using APIs, web scraping and data mining to storing the data in a SQL database.
  • Version Control and Management: As you advance in your data journey, you will realize the need to properly manage your files, scripts, notebooks and documents. This section will help you with that, for example by showing how to use Git and GitHub.
  • Artificial Intelligence: This section will introduce you to machine learning, natural language processing, deep learning and large language models, with the respective Python libraries such as scikit-learn, TensorFlow, PyTorch, NLTK, spaCy and LangChain, among others.
  • Data Application Development: This section will guide you in creating web application interfaces where you can show your data projects to the world. From creating dashboards to AI chatbots, you will learn how to achieve this using libraries such as Streamlit or Taipy.
  • Geospatial Modelling: If you need to deal with geospatial data, this section will help you analyse your data, for example using GeoPandas, and create maps.
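
To give a first impression of what the Python Basics and Statistics with Python sections build towards, the following is a minimal sketch that uses only the Python standard library. The temperature values and the function names describe and label are hypothetical and chosen purely for illustration; the wiki entries themselves may use different examples and libraries such as pandas.

```python
from statistics import mean, median

# Hypothetical sample data: daily temperatures in degrees Celsius.
temperatures = [18.5, 21.0, 19.5, 23.0, 17.0]


def describe(values):
    """Return a dictionary with a few descriptive statistics."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def label(value, threshold=20.0):
    """Classify a single value with a simple conditional."""
    return "warm" if value >= threshold else "cool"


# Printing, the classic first step.
print("Hello World")

# A loop over the items of a dictionary.
summary = describe(temperatures)
for name, value in summary.items():
    print(f"{name}: {value}")

# A list comprehension that applies the conditional to every value.
labels = [label(t) for t in temperatures]
print(labels)
```

The sketch combines a function, a loop, a conditional and a list comprehension, which are the building blocks that the basic and intermediate sections introduce step by step.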

Why is it important to learn Python nowadays?

Python is widely used in industry, particularly in data science, web development, automation, and artificial intelligence. Learning Python opens up numerous career opportunities, and it is now considered an essential component of data literacy.

Learning programming is now easier than before. There are hundreds of books, websites, and communities that help people learn Python. As mentioned above, the language is straightforward to learn, which helps new programmers pick up the basics quickly.

Now, with the emergence of Artificial Intelligence applications, learning programming has become less frustrating and more efficient. However, even without AI, Python has long been characterised by extensive documentation for its libraries and frameworks. A quick web search is usually enough to find the documentation of the specific library you need.

Why a Python Wiki when there are AI and numerous resources?

If you want to know how to start with Python, you would probably go to ChatGPT, YouTube, community blogs or paid resources such as Coursera, DataCamp, etc. Although you can get very useful information and roadmaps from ChatGPT and data science YouTubers, there is a problem of consistency regarding the technical language they use and their approaches.

The team behind this wiki has had the same problem and has navigated hundreds of these resources. This wiki aims to bring order to that chaos and to provide users with a basic roadmap that they can complement (with a critical perspective) with other materials on the internet.

Moreover, we are aware that Python is not perfect, and there are other programming languages such as R that can serve as a good complement. In this sense, this Python wiki is also linked to other R and statistical analysis entries.

For whom is this Python wiki targeted?

The purpose of this wiki is to offer an overview and a structured roadmap for people who want to begin their journey in Python and data, for example:

  • Students who are enrolled in courses where Python is the primary programming language used for data manipulation, analysis, and visualization.
  • Individuals who have some experience with R and want to understand the significance of Python and how it complements their existing skills.
  • Professionals from other fields looking to transition into data science.
  • Instructors who are teaching introductory programming, data science, or related fields will find these entries useful for explaining the importance and applications of Python to their students.


Created by: J. Gustavo Rodriguez Aboytes