Founded by Josh Reich and Drew Conway, the New York Open Statistical Programming Meetup started as the New York R Meetup with a handful of people in an office at Union Square Ventures. Since then it has grown to over 11,000 members and has been hosted at NYU, Columbia, AOL, iHeartRadio, eBay, Work-Bench and other locations.

Our mission is to spread knowledge of statistical programming techniques in open-source languages such as R, Python, Julia and Go, and of data science in general. Community building and socializing are just as important. Meetups start with pizza, continue with a 45-90 minute talk, and end with a trip to a local bar.

Attending

To attend, please visit the meetup page.

Presentations and Videos

Whenever possible we make presentations available at the Presentations page.

We now stream and host videos of meetups on Facebook and YouTube; older videos are scattered across a variety of services. All videos are also listed on the Presentations page.

Jobs

Job openings and other announcements are on the meetup discussion board.

Upcoming Meetup

Comparing {lightgbm} to other R GBDT Libraries

May 17, 2021 07:00:00 PM

For this meetup we turn to LightGBM, a competitor to xgboost and catboost.

Thank you to EcoHealth Alliance for providing the Zoom link.

Conversations during the meetup are encouraged in the monthly-meetup-chat channel in the nyhackr Slack: https://nyhackr.org/slack.html

About the Talk:
In this talk, attendees will learn about LightGBM, a popular gradient boosted decision tree (GBDT) framework. The talk begins with an overview of LightGBM and the features that allow it to be fast without sacrificing accuracy. After those fundamentals, attendees will learn about how {lightgbm} compares to two other popular GBDT projects with R packages: {catboost} and {xgboost}. That portion of the talk will cover why you might choose one library over the others, and will discuss issues ranging from ease-of-installation and data loading to algorithmic details like handling of sparse features and strategies used to decide which splits to evaluate.

About James:
James Lamb is an engineer at Saturn Cloud, where he works on a team building a managed Dask + Kubernetes product. He is a maintainer of LightGBM and has made many contributions to other open source data science projects, including xgboost and prefect. He holds master's degrees in Applied Economics (2014) and Data Science (2018). Before joining Saturn, he worked as an IoT Data Scientist at Amazon Web Services and Uptake.

GitHub: https://github.com/jameslamb
LinkedIn: https://www.linkedin.com/in/jameslamb1/
Twitter: https://twitter.com/_jameslamb

The talk will begin at 7 PM EDT and we will start admitting people to the event shortly beforehand. Since this meetup is completely remote, there will be no pizza, but everyone is encouraged to have pizza individually.

Website

The nyhackr website was built as an R Markdown website, and the source code can be accessed by the community on GitHub.

How to contribute

If you wish to contribute to the website, the process is pretty simple.

  1. Fork and clone the repository (an example can be found here)
  2. Create a new branch for your changes (warning, this step cannot be done in RStudio!)
  3. Make your changes. You can build and view your local version by using rmarkdown::render_site()
  4. When you are done, submit a pull request. Your changes might not appear on the public site right away as we have a development version for making sure changes don’t break the site.
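
The steps above can be sketched on the command line. This is only an illustration, not the project's official workflow: it runs against a throwaway local repository so it is safe to try, and the branch name `fix-typo`, the file `notes.md` and the demo identity are all hypothetical. With the real site you would replace the `git init` stand-in with a `git clone` of your own fork.

```shell
#!/bin/sh
set -eu

# Step 1 (stand-in): a throwaway local repository instead of a real fork.
# With the real site, run `git clone` on the URL of your fork here.
workdir=$(mktemp -d)
cd "$workdir"
git init -q site-demo
cd site-demo
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Step 2: create and switch to a new branch (hypothetical name).
git checkout -q -b fix-typo

# Step 3: make a change and commit it.
# For the real site, preview it first from R with rmarkdown::render_site().
echo "Example edit" > notes.md
git add notes.md
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add example edit"

# Step 4: with a real fork you would now push the branch
# (git push -u origin fix-typo) and open a pull request on GitHub.
git rev-parse --abbrev-ref HEAD
```

The last command prints the current branch name, which is a quick way to confirm you are not committing directly to the default branch before you push.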