Machine Learning
Deep dives into ML algorithms, models, and applications
// 12 articles filed
Upstage's SOLAR 10.7B uses depth upscaling - duplicating and fine-tuning Llama 2 layers - to create a model that outperforms 30B-class models on the HuggingFace leaderboard while remaining practical to serve.
Mahmudul Haque Qudrati
CEO & ML Engineer
Microsoft's Kosmos-2 produces bounding box coordinates inline with its text output, connecting every noun and phrase in its response to a specific region of the image.
Mahmudul Haque Qudrati
CEO & ML Engineer
Phi-3 Vision packs chart understanding, document analysis, and image reasoning into 4.2 billion parameters - small enough to run on a mobile device with CoreML or ONNX, yet scoring 59.8% on MMMU.
Mahmudul Haque Qudrati
CEO & ML Engineer
Stability AI's Adversarial Diffusion Distillation compresses SDXL into a 1-step model that generates 512px images in under 200ms - enabling real-time interactive generation.
Mahmudul Haque Qudrati
CEO & ML Engineer
RoBERTa improves on BERT through better pre-training - dynamic masking, no next-sentence prediction, larger batches, and more data - delivering consistently higher GLUE scores on classification tasks.
Mahmudul Haque Qudrati
CEO & ML Engineer
NVIDIA entered the foundation model market with two distinct plays: Nemotron-4 340B for synthetic data generation pipelines, and Llama-3.1-Nemotron-70B-Instruct with an Arena Hard score of 85.1% for enterprise inference.
Mahmudul Haque Qudrati
CEO & ML Engineer
Zhipu AI's CogVLM2 introduces a Visual Expert Module that gives visual tokens their own weight matrices, enabling richer image and video understanding than shared-weight alternatives.
Mahmudul Haque Qudrati
CEO & ML Engineer
Moondream2 is a 1.9B parameter vision-language model that fits in 1.2GB RAM when quantized, enabling image captioning, visual Q&A, and object detection on embedded hardware and edge devices.
Mahmudul Haque Qudrati
CEO & ML Engineer
Phi-3 Mini at 3.8B parameters outperforms Mixtral 8x7B on several benchmarks and runs in browsers via WebGPU or on Android/iOS via ONNX. Here's how.
Mahmudul Haque Qudrati
CEO & ML Engineer
Shanghai AI Lab's InternVL2-26B scores 61.2% on MMMU - within 2 points of GPT-4V - using a 6B vision encoder and dynamic high-resolution image tiling.
Mahmudul Haque Qudrati
CEO & ML Engineer
DistilBERT retains 97% of BERT's performance while being 40% smaller and 60% faster at inference, making it the practical default for production text classification that needs low latency on CPU.
Mahmudul Haque Qudrati
CEO & ML Engineer
Idefics2 is an 8B open multimodal model that handles interleaved image-text sequences, arbitrary image resolutions, and fine-tuning for document and chart understanding.
Mahmudul Haque Qudrati
CEO & ML Engineer