Spleeter – software to generate “stems” from final mix

Viewing 15 posts - 1 through 15 (of 26 total)
  • #397297
    DragRedSim
    Participant

      This came across my timeline today, and I immediately thought it would be useful for this community.

       

      https://github.com/deezer/spleeter

       

      This is a tool that claims to be able to split sources into “stems” via similar techniques to some of Harmonix’s post-RB4 DLC output, by isolating the frequencies of each part. The difference is that it attempts to do so automatically using machine learning, with a dataset already having been created.

       

      This may be useful in producing better-sounding customs. Hopefully it will help some of you still out there authoring!

      #507722
      jerrylive365
      Participant

        Very interesting. Will have to give this a look.

        #507729

        I didn’t know I needed this in my life.

        #507730

        I’m interested to see what kind of results people get with this software.

        #507743
        MrPrezident
        Moderator

          Very cool! Thanks for sharing this info.

          Keeping the content Canadian since 2017!

          SomeOldGuys: https://db.c3universe.com/songs/all/__user/someoldguys
          MrPrezident: https://db.c3universe.com/songs/all/__user/MrPrezident

          #507748
          ataeaf
          Participant

            I have trouble getting this to work. Either the audio files aren’t produced or they end way too early.

            #507758
            Anonymous


              Hmmm… Linux-based?! It will be a real challenge to hunt down the bugs. Many thanks for this! :smug: :smug: :smug: :rock: :rock: :rock:

              #507772
              yaniv297
              Keymaster

                I’ve been using RX7 for this and results have generally been good, especially for drums and vocals.

                #507790
                StackOverflow0x
                Participant

                  I tried this and I’m just blown away by how easily it outputs great results! Drums and vocals have turned out really well, and bass too. For the most part, this tool basically gives you free karaoke versions of songs. The bass extraction should also help with charting, since bass is usually the hardest part to hear and it comes out fairly well.

                  #507810
                  Shroud
                  Participant

                    Bah, doesn’t work for me…

                     

                    I always get “WARNING:spleeter:cannot reshape array of size 0 into shape (0)”, which sounds like a low-level Python error, but all I am doing is running their basic suggested command for separating 2, 4 or 5 stems.

                    #507811
                    StackOverflow0x
                    Participant

                      I know it sounds complicated to set up, but I just followed the instructions exactly as they provided. Download the right version for whatever version of Python you have installed. Download Conda. Use git to get everything. A more common error I’ve seen is the input path in the arguments to run it being incorrect. You can drag and drop a file into the console window to get the path in there if that’s faster.
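Since a wrong input path is the error mentioned above, here is a minimal Python sketch that checks the file exists before handing it to Spleeter. The helper name `build_separate_command` is hypothetical, and the `-i`/`-p`/`-o` flag layout is an assumption based on the project README around the time of this thread; the flags may differ in your installed version, so check `spleeter separate --help` first.

```python
from pathlib import Path


def build_separate_command(input_file, output_dir="output", model="spleeter:2stems"):
    """Return an argument list for the Spleeter CLI, or raise if the input is missing."""
    path = Path(input_file).expanduser()
    if not path.is_file():
        # Fail early with a readable message instead of a cryptic error from the tool.
        raise FileNotFoundError(f"Input audio not found: {path}")
    # Assumed flag layout (-i input, -p model, -o output folder); verify with --help.
    return ["spleeter", "separate", "-i", str(path), "-p", model, "-o", str(output_dir)]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Passing a list rather than a single string also avoids quoting problems with paths that contain spaces.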

                      #507812
                      Shroud
                      Participant


                        They don’t really have good instructions. I know this is open source, which is great, but often it means the documentation is non-professional.

                         

                        However… eventually I DID get some results, and they are surprisingly good!

                         

                        Their example file worked, so I thought there might be some format requirements that I hadn’t noticed in the documentation. Comparing the example file with the few MP3s I was using to test the program, I noticed that all of mine had a variable bitrate, while the example MP3 had a fixed bitrate.
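If variable bitrate really is the culprit (that is only this poster's hypothesis), one way to test it is to re-encode the file at a constant bitrate with ffmpeg before separating. The sketch below only builds the ffmpeg argument list. The helper name `build_cbr_command` and the 192k default are illustrative assumptions, and it presumes an ffmpeg build that includes the `libmp3lame` encoder.

```python
def build_cbr_command(src, dst, bitrate="192k"):
    """Return an ffmpeg argument list that re-encodes src as a constant-bitrate MP3."""
    return [
        "ffmpeg",
        "-i", str(src),            # input file (e.g. a variable-bitrate MP3)
        "-codec:a", "libmp3lame",  # encode the audio stream with LAME
        "-b:a", bitrate,           # a fixed -b:a value requests constant bitrate
        str(dst),
    ]
```

Run it with `subprocess.run(build_cbr_command("song_vbr.mp3", "song_cbr.mp3"), check=True)`. Decoding to WAV instead (`ffmpeg -i song.mp3 song.wav`) sidesteps MP3 bitrate modes altogether.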

                         

                        So I tried a fixed-bitrate MP3 and it worked! Well, partially… the first time it crashed in the middle of processing (but at least it started), perhaps for lack of RAM. I know I have an old PC with only 4 GB of RAM, and I know that this task must require some intensive processing… but really, how is it possible that the program uses 4 GIGAbytes of memory to process a ~4 MEGAbyte file? It actually does: on my second try, after closing every other program I could to free more RAM, I monitored Spleeter and it went above 3.5 GB of usage at some point.
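Part of the answer to the memory question is that the MP3 size is the compressed size, and the audio has to be decoded before any processing. A rough back-of-the-envelope calculation follows. The 44.1 kHz stereo, 32-bit float representation is an assumption for illustration, not a measurement of Spleeter's internals.

```python
def decoded_size_bytes(duration_s, sample_rate=44_100, channels=2, bytes_per_sample=4):
    """Size of uncompressed audio held in memory as 32-bit float samples."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


four_minutes = decoded_size_bytes(4 * 60)
print(f"{four_minutes / 1e6:.1f} MB")  # 84.7 MB for one decoded copy of the waveform
```

So a ~4 MB MP3 is already roughly twenty times larger once decoded. On top of that the program presumably holds spectrograms, one output per stem, the network's intermediate results and the model weights, so memory use far above the file size is plausible even if this sketch cannot account for the exact figure.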

                         

                        This time it didn’t crash, but it got stuck at some point. It did, however, get as far as producing some of the expected audio files: vocals, drums and piano (I was trying the 5-stem model). The third one is empty, but it could be that it got stuck after creating the file and before filling it with data. Vocals and drums are VERY good. I don’t know if the models used by the program are tuned for a specific music genre; I tried it with a synth-pop song (“Adamski – Killer”), and I have no idea whether that is a good or bad choice for Spleeter. If you’re familiar with the song, the drums are synthesized, not real acoustic drums, but they came out pretty well. The voice is also very well isolated: not perfectly, but essentially you only hear a few “beeps” around the vocals. These are not processing artifacts but small sounds from the original audio which apparently confused the program a bit, and they are not particularly annoying.

                         

                        The processing is slow (more than half an hour, though it’s hard to tell exactly when it got stuck) because I am using the CPU option and, as I said, I have an old PC. There is a GPU option which is claimed to take less time than the duration of the song itself, but it specifically requires an Nvidia graphics card. I have a GeForce 550 Ti, but it’s unbranded, so I am not sure it will work.

                        #507826
                        Shroud
                        Participant

                          More tests and more failures… with variable-bitrate MP3s it just never works for me, and with fixed-bitrate MP3s it always gets stuck between generating stem files. Sometimes no stem files are generated at all; other times I get 3 or 4 (vocals, drums, other, and sometimes piano), but it never generates the bass stem. There is a slim chance it’s still processing, but I let it run for over 2 hours and it didn’t show any sign of life…

                           

                          One interesting question is why it generates the “other” stem (basically the background track) before the remaining stems. I would have expected the algorithm to identify and extract the stems of known instruments first, and then save whatever remains into other.wav.

                           

                          I focused on testing the 5stems option, but I should probably use the 4stems option instead. I now understand that the “piano” stem in 5stems mode really looks for piano sounds in the audio, possibly including electric piano but not keyboards in general, so if the song does not specifically include piano, the 4stems mode should be used instead.

                           

                          And yes, unfortunately there is no standard way to extract a guitar stem, apparently because guitar sounds vary too much across music. But the whole software is based on machine learning, and with a proper dataset (meaning plenty of guitar stems from other songs, which I don’t have) it should be possible to train the algorithm to recognize the guitar as well. Perhaps this would need to be done multiple times for different guitar sounds (e.g. with various combinations of the usual guitar effects such as distortion, reverb, chorus, flanger, etc.), but I would bet that if Spleeter attracts enough interest, people will start to train and share new models for extracting other instruments.

                           

                          For RB3 custom authors, there is great potential here… I have my own issues with running the algorithm, but no reason to believe the same happens to others. Setting those issues aside, extracting the stems literally takes a single command line, and you don’t need to mess with any options other than picking the model you want. To recap, the current models are 2stems (vocals + rest), 4stems (vocals + drums + bass + rest) and 5stems (vocals + drums + bass + piano + rest).
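Because runs in this thread sometimes stop before writing every stem, a small check of the output folder can show which stems need another attempt. This is a sketch under assumptions: the helper `missing_stems` is hypothetical, and the per-model file names (with `accompaniment` as the "rest" stem for 2stems and `other` for the larger models) are assumed to match what your Spleeter version writes.

```python
from pathlib import Path

# Assumed stem names per model; adjust if your output folder differs.
EXPECTED_STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
    "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def missing_stems(stem_dir, model):
    """List stems whose .wav file is absent or zero bytes long in stem_dir."""
    missing = []
    for name in EXPECTED_STEMS[model]:
        wav = Path(stem_dir) / f"{name}.wav"
        if not wav.is_file() or wav.stat().st_size == 0:
            missing.append(name)
    return missing
```

For example, `missing_stems("output/song", "spleeter:5stems")` returns an empty list when every expected file is present and non-empty. A file that contains only a header would still pass this check, so treat it as a first filter rather than proof the stem is complete.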

                          #507851
                          Anonymous

                            This will be time-consuming, but it is a great tool for splitting up and maybe isolating tracks. It will get the job done.

                            #507859
                            Shroud
                            Participant


                              I can’t get the GPU mode working, which supposedly speeds up the whole processing. The CPU mode still crashes often, generally takes about half an hour to produce results, and in most cases it doesn’t output all the stems, so I have to repeat the stems generation a few times per song. Overall I expect a couple of hours to process one song, but it is really “time consuming” only for my PC, not for me because all I have to do is write a command line and wait for the stems… so despite all the flaws I can accept the situation :)

                               

                              Now the real question is whether the stems produced are usable for RB3 customs. When you listen to the separate stems, they sound pretty bad, so I wonder whether the audio will also sound bad in-game when a player misses notes for a few seconds. But if you listen to all stems simultaneously, they definitely add up to the original (good) audio.
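The claim that the stems add up to the original can be checked numerically rather than by ear. A minimal sketch, assuming each stem has already been loaded as an equal-length list of samples; the loading step and the helper names `mix_stems` and `max_residual` are illustrative and not part of Spleeter:

```python
def mix_stems(stems):
    """Sum equal-length stems sample by sample to rebuild a full mix."""
    return [sum(samples) for samples in zip(*stems)]


def max_residual(original, stems):
    """Largest absolute difference between the original and the summed stems."""
    mixed = mix_stems(stems)
    return max(abs(o - m) for o, m in zip(original, mixed))
```

A residual near zero means the stems reconstruct the mix closely. It says nothing about how clean each stem sounds on its own, which is the part that matters when the game mutes one instrument.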
