Speed of Program Instance

When giving an instance of a type class, I’ve been preferring using Program Instance instead of proving a lemma then using Instance because it feels cleaner and then I won’t have a lemma I never use polluting the namespace. However, I recently found that at times using Program Instance has a noticeable slowdown compared to the lemma approach. Is this something that people have encountered before, and is there any way to fix it?