MLC Chat is a great iPhone app for running models (it's on Android too) and currently ships with Llama 3.2 3B Instruct - not the version Meta released today, but a quantized version of their previous release.
I wouldn't be surprised to see it add the new ones shortly; it's quite actively maintained.
This was just recently open sourced and is pretty nice. The only issues I've had are very minor UI ones (on Android; from skimming the comments, it sounds like it runs better on iOS).
I'm on Android, but my somewhat elaborate solution was to install Ollama on my home laptop and then ssh in when I want to query a model. I figured that'd be better for my phone battery. Since my home computer is behind NAT, I run yggdrasil on everything so I can access my AI on the go.
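
A rough sketch of what that ssh round trip could look like if scripted in Python, rather than typed by hand. It assumes `ssh` is available on the client, that `ollama` is on the remote machine's PATH, and that the remote host is reachable (for example via its yggdrasil address). The function names, the host alias and the model name are all made up for illustration.

```python
import shlex
import subprocess


def build_ssh_command(host: str, model: str, prompt: str) -> list[str]:
    """Return the argv for running one Ollama prompt on a remote host over ssh."""
    # ssh passes the remote command to the remote shell as a single string,
    # so quote it here to keep the prompt intact as one argument.
    remote_command = shlex.join(["ollama", "run", model, prompt])
    return ["ssh", host, remote_command]


def ask(host: str, model: str, prompt: str) -> str:
    """Run the prompt on the remote machine and return the model's reply."""
    result = subprocess.run(
        build_ssh_command(host, model, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise if ssh or ollama exits with an error
    )
    return result.stdout.strip()


# Hypothetical usage: "home-laptop" stands in for an ssh host alias and
# "llama3.2" for whichever model has already been pulled on that machine.
# print(ask("home-laptop", "llama3.2", "Why is the sky blue?"))
```

Quoting the remote command with `shlex.join` matters because prompts tend to contain spaces, quotes and question marks that the remote shell would otherwise split or expand.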