Hacker News | melondonkey's comments

Usually pretty scam savvy but dropped my guard and bought an absolute garbage AI translation of The Little Prince on Amazon. Now I research anything before buying


What are the fastest ways to find telltale signs you missed?


I think it’s honestly annoying how they feel they have to parenthetically note, every time, that something is a lie or untrue. While their intention is good, I think it does a service to no one and underestimates the intelligence of their listeners.

Also, almost every story gets tied to either identity politics or climate change, which gets annoying even for those who agree. It’s like watching a movie with too much expository dialogue.


Looks like Pokemon Jirachi


The cultural divide between ML engineers and the “girls and gays” in data science is very real and, in my experience, getting worse. It’s good, but rare, when the styles can be brought together.


Damn, this is like the fifth time series framework posted this week.

This one seems theoretically more interesting than some others but practically less useful. For one, who wants to do stuff in tensorflow anymore, let alone tensorflow-probability? TFP has had ample time to prove its worth, and from what I can tell pretty much no one is using it because of a worst-of-both-worlds problem: the DL community prefers pytorch and the stats community prefers Stan.

I’m starting to feel like time series and forecasting research is just going off the rails, as every company tries to jump on the DL/LLM hype train and tell us that somehow neural nets know something we don’t about predicting the future from the past.


What are the other four frameworks?

> For one, who wants to do stuff in tensorflow anymore let alone tensorflow-probability.

AutoBNN is a JAX library and technically has nothing to do with TF Probability, though it was developed by the TF Probability team.

> DL community prefers pytorch and stats community prefers Stan.

It looks like the JAX ecosystem for stats is growing: NumPyro is based on JAX, PyMC has a JAX backend, https://github.com/blackjax-devs/blackjax has effective samplers, there is https://github.com/jax-ml/bayeux, and now AutoBNN.

> This one seems theoretically more interesting than some others but practically less useful.

Are there other factors why you think AutoBNN is not practically useful, apart from being based on the wrong foundation (which was a mistaken belief of yours)?


Answering my own question about the other frameworks (one also from Google, BTW): https://exa.ai/search?c=news&q=time%20series%20prediction%20...


It's hard to meet everyone where they are and at the same time give them a relevant practical application for their own life. Good learners just soak it up and look for the application later, but that doesn't fit everyone. It's hard to write even a pop song that everyone likes, so math education that appeals to all is almost impossible.


Data scientist here who's also tired of the tools. We put so much effort into trying to educate the DSes in our company to get away from notebooks and use IDEs like VS or RStudio, and Databricks has been a step backwards because we didn't get the integrated version.


I'm a data scientist and I agree that work meant to last should be in a source-controlled project coded via a text editor or IDE. But sometimes it's extremely useful to get -- and iterate on -- immediate results. There's no good way to do that without either notebooks or at least a REPL.


There is a VSCode extension, plus databricks-connect… plus DABs. There are a lot of customers doing local-only development.


Thank you! I am so tired of all those unmaintainable, undebuggable notebooks. Years ago, Databricks had a specific page in their documentation stating that notebooks were not for production-grade software. It has been removed. And now there's a ChatGPT-like assistant in their notebooks ... What a step backwards. How can all those developers be so happy without the bare minimum tools to diagnose their code? And I am not even talking about unit testing here.


It’s less about notebooks and more about SDLC practices. Notebooks may encourage writing throwaway code, but if you split your code correctly, you can still do unit testing, write modular code, etc. And the ability to use “arbitrary files” as Python packages has existed for quite a while, so you can get the best of both worlds: quick iteration, plus the ability to package your code as a wheel and distribute it.

P.S. here is a simple example of unit testing: https://github.com/alexott/databricks-nutter-repos-demo - I wrote it more than three years ago.
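To make the "split your code correctly" point concrete, here's a hypothetical sketch (names are illustrative, not taken from the linked nutter demo): keep the logic in a plain Python module so it imports cleanly into a notebook, a wheel, or a pytest run alike.

```python
# pricing.py -- plain module, importable from a notebook or packaged as a wheel.
# (Hypothetical example; function names are illustrative.)
def clean_prices(prices):
    """Drop non-positive prices; keep the rest unchanged."""
    return [p for p in prices if p > 0]

def total(prices):
    """Sum of the cleaned prices."""
    return sum(clean_prices(prices))

# test_pricing.py -- runs under pytest with no notebook or cluster attached.
def test_clean_prices():
    assert clean_prices([10.0, -1.0, 0.0, 2.5]) == [10.0, 2.5]

def test_total():
    assert total([10.0, -1.0, 0.0, 2.5]) == 12.5
```

The notebook then becomes a thin driver that imports `pricing` and iterates interactively, while CI exercises the same functions with pytest.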


One detail I don’t really understand is the low-variance normal component of the target mixture. I’d be curious to see from the weights how often that component was used.


I know. Here I am modeling my data generating process like a chump.


They just need a less engineering-oriented DS role and they'll be fine. Consulting is a good way to work in lots of industries and try things out.

