Hello, there 👋🏾!
I'm an economist and data scientist with over 7 years of professional experience and more than 10 years working with data. I specialize in using quantitative methods to process, analyze, and visualize data for decision-making purposes.
I have a passion for teaching programming, and for over 3 years, I have had the privilege of delivering courses on introductory data management, information visualization, and machine learning. Currently, I'm a data science teacher at Le Wagon.
My main expertise is data analysis, visualization, and statistical modeling, and throughout the years I have gained some extra specialized skills such as:
These are some of the projects that I have developed:
While working at LNPP-CIDE:
I was in charge of media monitoring, so I created a platform that automated custom web searches for mentions of the institution, its teachers, and its research centers in digital newspapers. I developed a script that scraped more than one hundred Mexican newspapers and gathered the text of each article along with its metadata (date, author, title). The platform allowed end users (the PR department) to retrieve the information and export it as Excel files for their reports.
I created a platform that scraped the Mexican federal government's public procurement website every day. I then used regular expressions to search this information for contracts on which we, as an institution, could submit a proposal.
I created the backend for a platform that scraped around 50 specialized newspapers to find news about innovation and new technologies in economic industries relevant to the Government of the State of Guanajuato.
As a freelancer:
I have scraped thousands of news articles and created datasets about violence against journalists, homicides, lynchings, and mentions of public institutions.
I created an executable Windows program (.exe) to scrape baseball statistics.
As a personal project, I scraped the Diario Oficial de la Federación (DOF) and created a public dataset. The DOF is the official government gazette where official documents, laws, decrees, regulations, and notices from the Mexican federal government are published.
For these projects, I always use Python, sometimes with the Django framework and libraries such as requests and BeautifulSoup.
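The regular-expression filtering described for the procurement platform can be sketched as follows. This is a minimal illustration using only the standard library, not the code from those projects: the keyword list, the function name `find_relevant`, and the sample titles are all hypothetical, and the scraping step itself is left out.

```python
import re

# Hypothetical keywords; the real search terms are not given in the text.
KEYWORDS = ["evaluación", "encuesta", "análisis de datos"]

# One case-insensitive pattern that matches any keyword as a whole phrase.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    flags=re.IGNORECASE,
)


def find_relevant(titles):
    """Return (title, matched keywords) pairs for titles that hit any keyword."""
    hits = []
    for title in titles:
        matches = sorted({m.lower() for m in PATTERN.findall(title)})
        if matches:
            hits.append((title, matches))
    return hits


# Made-up contract titles standing in for scraped procurement listings.
sample_titles = [
    "Servicio de evaluación de impacto de programas sociales",
    "Adquisición de mobiliario de oficina",
    "Levantamiento de encuesta y análisis de datos",
]

relevant = find_relevant(sample_titles)
for title, matches in relevant:
    print(title, "->", matches)
```

Compiling a single alternation keeps the filter to one pass per title, and `re.escape` prevents a keyword containing punctuation from being read as regex syntax.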
Complex survey analysis
I have worked with microdata from complex survey designs to compute custom estimates and fit econometric models. When using these datasets, it's important to account for elements of the survey design such as the strata, the finite population correction (FPC), and the sampling weights. Some of the surveys I have worked with are:
Population Census, Mexico
Encuesta Nacional de Ingresos y Gastos de los Hogares, ENIGH, Mexico. I used this survey to participate in the Data Mexico Challenge.
Encuesta Nacional de Población Privada de la Libertad, ENPOL, Mexico. I used this survey to participate in the Prison Datathon.
Encuesta Nacional de Ocupación y Empleo, ENOE, Mexico
Encuesta Nacional de Inclusión Financiera, ENIF, Mexico.
Encuesta de Movilidad Social, ESRU-EMOVI, Mexico. I used this survey for my Master's thesis.
Demographic and Health Survey, DHS, Colombia. I used this survey for my undergraduate thesis.
When working with complex surveys, I usually use the Stata software or the R programming language.
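To illustrate why strata, FPC, and weights matter, here is a minimal sketch of the textbook estimator for a mean under stratified simple random sampling. It is written in Python for illustration rather than in Stata or R, the function name `stratified_mean` is hypothetical, and the population sizes and sample values are made up. Real surveys such as the ones listed above typically add clustering and multiple sampling stages, which this sketch ignores.

```python
from statistics import mean, variance


def stratified_mean(strata):
    """Estimate a population mean and its variance under stratified
    simple random sampling.

    `strata` maps a stratum label to (N_h, sample), where N_h is the
    stratum's population size and `sample` holds the observed values.
    """
    N = sum(N_h for N_h, _ in strata.values())
    estimate, var = 0.0, 0.0
    for N_h, sample in strata.values():
        n_h = len(sample)
        W_h = N_h / N          # stratum weight (share of the population)
        fpc = 1 - n_h / N_h    # finite population correction
        estimate += W_h * mean(sample)
        var += W_h**2 * fpc * variance(sample) / n_h
    return estimate, var


# Made-up data: two strata with different population sizes.
strata = {
    "urban": (300, [20, 22, 24, 26]),
    "rural": (100, [10, 12, 14]),
}

estimate, var = stratified_mean(strata)
print("mean:", estimate, "standard error:", var**0.5)
```

Ignoring the design here would mean taking the plain average of all seven observations, which over-represents the smaller stratum relative to its population share.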
Throughout my career, I have honed my expertise in data management and visualization, GIS analysis, and automation. I have experience using GIS tools such as QGIS, ArcGIS, and GeoDa to analyze and visualize geospatial data. I have also worked on projects where I used Python to automate GIS processes, saving time and increasing efficiency.
As a data scientist, I have had the opportunity to work on a variety of projects where I have made extensive use of geospatial calculations, vector operations, and map visualizations. Specifically, I have experience in tasks such as calculating distances, areas, spatial autocorrelation, clustering, and conducting vector operations such as centroids, buffers, unions, intersections, and differences, among others. I have also developed skills in map visualization techniques such as choropleths, dot density, heatmaps, animations, and interactive visualizations. Additionally, I have experience with raster operations, including georeferencing images, conducting zonal statistics, and analyzing differences over time.
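As one small instance of the distance calculations mentioned above, the following sketch computes the great-circle distance between two points with the haversine formula, using only the standard library. The function name `haversine_km` is hypothetical, the coordinates are approximate and for illustration only, and the Earth is treated as a sphere, so results are approximations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
mexico_city = (19.4326, -99.1332)
guanajuato = (21.0190, -101.2574)
print(round(haversine_km(*mexico_city, *guanajuato), 1), "km")
```

For areas, buffers, or intersections, a projected coordinate system and a dedicated geometry library are usually a better fit than hand-written formulas.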
As an economist, I took specialized courses on causal inference using experimental and quasi-experimental techniques (difference-in-differences, propensity score matching, synthetic controls, panel data, and instrumental variables).
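As a brief illustration of one of these techniques, the sketch below computes the basic two-group, two-period difference-in-differences estimate: the change in the treated group minus the change in the control group. The function name `did_estimate` and all outcome values are made up, and the sketch omits the standard errors and the parallel-trends checks that an applied analysis would need.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """2x2 difference-in-differences: change in the treated group
    minus change in the control group."""
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(ctrl_post) - mean(ctrl_pre)
    return treated_change - control_change


# Made-up outcomes for illustration.
effect = did_estimate(
    treat_pre=[10, 12],
    treat_post=[15, 17],
    ctrl_pre=[9, 11],
    ctrl_post=[11, 13],
)
print("estimated effect:", effect)
```

Subtracting the control group's change removes the time trend shared by both groups, which is what separates this estimate from a simple before-and-after comparison.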
When working as a consultant for CLEAR-LAC, I had to evaluate the technical quality and statistical robustness of more than 20 impact evaluation reports delivered by external consultants to the IDB (Inter-American Development Bank). These reports included experimental and quasi-experimental methodologies.
When working at LNPP-CIDE, I was in charge of the statistical analysis to evaluate an experimental policy that aimed to increase COVID-19 testing in the Mexico City population by sending different SMS messages.
I have worked extensively with texts, analyzing thousands of documents in tasks like:
Preprocessing and cleaning text corpora.
Visualizing the most frequent terms.
Searching for patterns such as proper names, addresses, mentions, events, and specific grammatical structures using regular expressions.
Finding the main topics in a corpus of texts using machine learning techniques such as LDA and clustering.
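The first two tasks in this list can be sketched with the standard library alone. This is a minimal illustration, not the pipeline from those projects: the function name `top_terms`, the stopword list, and the sample documents are hypothetical, and a real corpus would need a fuller stopword list and possibly stemming or lemmatization.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "was"}


def top_terms(docs, k=3):
    """Lowercase each document, keep alphabetic tokens, drop stopwords,
    and return the k most frequent terms with their counts."""
    counts = Counter()
    for doc in docs:
        # [^\W\d_]+ matches runs of letters, including accented ones.
        tokens = re.findall(r"[^\W\d_]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(k)


# Made-up documents for illustration.
docs = [
    "The prosecutor opened a case.",
    "The case was closed in court.",
    "Court records and case files.",
]

frequent = top_terms(docs, k=2)
print(frequent)
```

The resulting term counts are the usual input for a bar chart or word cloud of the most frequent terms.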
You can check my public repository (in Spanish) for the workshop that I presented to employees of the prosecutor's office of the State of Hidalgo:
As mentioned before, I created a media monitoring platform for CIDE and another one for finding public procurement opportunities.
When working at LNPP-CIDE, I created a platform for updating and querying social and economic indicators and their metadata.
When working at LNPP-CIDE:
I was the only developer (and data scientist), so I learned to deploy my projects using AWS services such as EC2, EBS, RDS, S3, Batch Jobs, Lambda Functions, and more.
For one project, I used Azure services such as speech recognition, buckets, and step functions to automate the transcription of audio interviews to text.
As a teacher at Le Wagon, I teach cloud services on GCP, such as VM instances, GCS, and BigQuery.
When working at LNPP-CIDE, I taught Python programming for data management, visualization, and machine learning for 3 years.
I also taught these topics to employees of reputable institutions such as Banco de Mexico (Banxico), Comisión Federal de Competencia Económica (COFECE), and the prosecutor's office of the State of Hidalgo.
I currently teach the full sequence of the Le Wagon data science bootcamp, including complex topics such as deep learning and cloud services.