flystem-usls/README.md at a0d410b46d693d133e4db1d360a3106f22512d46

1.6 KiB

Raw Blame History

This demo showcases how to use CLIP to compute similarity between texts and images, which can be employed for image-to-text or text-to-image retrieval tasks.

Quick Start

cargo run -r --example clip

Or you can manully

1.Donwload CLIP ONNX Model

clip-b32-visual
clip-b32-textual

2. Specify the ONNX model path in `main.rs`

    // visual
    let options_visual = Options::default()
        .with_model("VISUAL_MODEL")  // <= modify this
        .with_i00((1, 1, 4).into())
        .with_profile(false);

    // textual
    let options_textual = Options::default()
        .with_model("TEXTUAL_MODEL")  // <= modify this
        .with_i00((1, 1, 4).into())
        .with_profile(false);

3. Then, run

cargo run -r --example clip

Results

(90.11472%) ./examples/clip/images/carrot.jpg => 几个胡萝卜 
[0.04573484, 0.0048218793, 0.0011618224, 0.90114725, 0.0036694852, 0.031348046, 0.0121166315]

(94.07785%) ./examples/clip/images/peoples.jpg => Some people holding wine glasses in a restaurant 
[0.050406333, 0.0011632168, 0.0019338318, 0.0013227565, 0.003916758, 0.00047858112, 0.9407785]

(86.59852%) ./examples/clip/images/doll.jpg => There is a doll with red hair and a clock on a table 
[0.07032883, 0.00053773675, 0.0006372929, 0.06066096, 0.0007378078, 0.8659852, 0.0011121632]

TODO

TensorRT support for textual model

1.6 KiB Raw Blame History

Quick Start

Or you can manully

1.Donwload CLIP ONNX Model

2. Specify the ONNX model path in main.rs

3. Then, run

Results

TODO

1.6 KiB

Raw Blame History

2. Specify the ONNX model path in `main.rs`