stable-diffusion.cpp

bssrdf a469688e30 feat: add TencentARC PhotoMaker support (#179)	2024-03-12 23:15:17 +08:00

* first efforts at implementing photomaker; lots more to do
* added PhotoMakerIDEncoder model in SD
* fixed some bugs; now photomaker model weights can be loaded into their tensor buffers
* added input id image loading
* added preprocessing input id images
* finished get_num_tensors
* fixed a bug in remove_duplicates
* add a get_learned_condition_with_trigger function to do photomaker stuff
* add a convert_token_to_id function for photomaker to extract trigger word's token id
* making progress; need to implement tokenizer decoder
* making more progress; finishing vision model forward
* debugging vision_model outputs
* corrected clip vision model output
* continue making progress in id fusion process
* finished stacked id embedding; to be tested
* remove garbage file
* debugging graph compute
* more progress; now alloc buffer failed
* fixed wtype issue; input images can only be 1 because issue with transformer when batch size > 1 (to be investigated)
* added delayed subject conditioning; now photomaker runs and generates images
* fixed stat_merge_step
* added photomaker lora model (to be tested)
* reworked pmid lora
* finished applying pmid lora; to be tested
* finalized pmid lora
* add a few print tensor; tweak in sample again
* small tweak; still not getting ID faces
* fixed a bug in FuseBlock forward; also remove diag_mask op in for vision transformer; getting better results
* disable pmid lora apply for now; 1 input image seems working; > 1 not working
* turn pmid lora apply back on
* fixed a decode bug
* fixed a bug in ggml's conv_2d, and now > 1 input images working
* add style_ratio as a cli param; reworked encode with trigger for attention weights
* merge commit fixing lora free param buffer error
* change default style ratio to 10%
* added an option to offload vae decoder to CPU for mem-limited gpus
* removing image normalization step seems making ID fidelity much higher
* revert default style ratio back to 20%
* added an option for normalizing input ID images; cleaned up debugging code
* more clean up
* fixed bugs; now failed with cuda error; likely out-of-mem on GPU
* free pmid model params when required
* photomaker working properly now after merging and adapting to GGMLBlock API
* remove tensor renaming; fixing names in the photomaker model file
* updated README.md to include instructions and notes for running PhotoMaker
* a bit clean up
* remove -DGGML_CUDA_FORCE_MMQ; more clean up and README update
* add input image requirement in README
* bring back freeing pmid lora params buffer; simply pooled output of CLIPvision
* remove MultiheadAttention2; customized MultiheadAttention
* added a WIN32 get_files_from_dir; turn off PhotoMaker if receiving no input images
* update docs
* fix ci error
* make stable-diffusion.h a pure c header file
  This reverts commit `27887b630d`.
* fix ci error
* format code
* reuse get_learned_condition
* reuse pad_tokens
* reuse CLIPVisionModel
* reuse LoraModel
* add --clip-on-cpu
* fix lora name conversion for SDXL

---------

Co-authored-by: bssrdf <bssrdf@gmail.com>
Co-authored-by: leejet <leejet714@gmail.com>

CMakeLists.txt	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
README.md	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
json.hpp	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
miniz.h	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
stb_image.h	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
stb_image_resize.h	feat: add TencentARC PhotoMaker support (#179)	2024-03-12 23:15:17 +08:00
stb_image_write.h	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
zip.c	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00
zip.h	feat: load weights from safetensors and ckpt (#101)	2023-12-03 15:47:20 +08:00

README.md

json.hpp library from: https://github.com/nlohmann/json
ZIP Library from: https://github.com/kuba--/zip