metal : use autoreleasepool to avoid memory leaks (llama/5437)

There appears to be a known memory leak when using `MTLCommandBuffer`. Wrapping the work in an `@autoreleasepool` is the fix suggested in [1,2].

[1] https://developer.apple.com/forums/thread/662721
[2] https://forums.developer.apple.com/forums/thread/120931

This change-set wraps the body of `ggml_metal_graph_compute` in an `@autoreleasepool`.

This commit addresses https://github.com/ggerganov/llama.cpp/issues/5436
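
The pattern described above can be sketched in isolation as follows. This is a minimal illustration, not code from `ggml-metal.m`: the helper name `submit_empty_pass` and the loop count are hypothetical, and the sketch assumes that the object returned by `-[MTLCommandQueue commandBuffer]` is autoreleased, as Cocoa naming conventions imply for a method that does not begin with `new`, `alloc`, or `copy`. Without a pool around each iteration, such objects would only be released when an enclosing pool drains, which in a long-running loop may be never.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper (not part of ggml): submits one empty command buffer
// and reports whether it completed.
static BOOL submit_empty_pass(id<MTLCommandQueue> queue) {
    BOOL ok = NO;

    // Autoreleased objects created inside this block, including the command
    // buffer, are released when the block exits instead of piling up in an
    // outer pool.
    @autoreleasepool {
        id<MTLCommandBuffer> command_buffer = [queue commandBuffer];

        [command_buffer commit];
        [command_buffer waitUntilCompleted];

        ok = (command_buffer.status == MTLCommandBufferStatusCompleted);
    }

    return ok;
}

int main(void) {
    id<MTLDevice> device = MTLCreateSystemDefaultDevice();
    if (device == nil) {
        return 1; // no Metal device available
    }

    id<MTLCommandQueue> queue = [device newCommandQueue];

    // Repeated submissions, analogous to calling a graph-compute function
    // once per evaluation.
    for (int i = 0; i < 1000; ++i) {
        if (!submit_empty_pass(queue)) {
            return 1;
        }
    }

    return 0;
}
```

On macOS this should build with something like `clang example.m -framework Foundation -framework Metal`. The result is copied into `ok` before the pool exits so that nothing created inside the pool is used after it drains.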
2024-02-10 02:53:28 -08:00 · 47dfe9d4db
parent 1d3270cc8f
commit 47dfe9d4db
1 changed file with 2 additions and 0 deletions
--- a/ggml-metal.m
+++ b/ggml-metal.m
@@ -696,6 +696,7 @@ static bool ggml_metal_graph_compute(
        struct ggml_metal_context * ctx,
               struct ggml_cgraph * gf) {

+    @autoreleasepool {
    MTLComputePassDescriptor * edesc = MTLComputePassDescriptor.computePassDescriptor;
    edesc.dispatchType = MTLDispatchTypeSerial;

@@ -2281,6 +2282,7 @@ static bool ggml_metal_graph_compute(
        [[MTLCaptureManager sharedCaptureManager] stopCapture];
    }

+    }
    return true;
 }